METHODS OF DIAGNOSIS OF PROSTATE CANCER,
COMPOSITIONS AND METHODS OF SCREENING FOR 
MODULATORS OF PROSTATE CANCER 


CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims priority from the following applications: USSN
09/687,576, filed October 13, 2000; USSN 60/276,791, filed March 16, 2001; USSN
60/288,589, filed May 4, 2001; USSN 09/733,742, filed December 8, 2000; USSN
09/733,288, filed December 8, 2000; USSN 09/847,046, filed April 30, 2001; USSN
60/276,888, filed March 16, 2001; USSN 60/286,214, filed April 24, 2001; USSN
60/281,922, filed April 6, 2001; and USSN 60/263,957, filed January 24, 2001, which are
incorporated herein by reference in their entirety.

FIELD OF THE INVENTION 

The invention relates to the identification of nucleic acid and protein 
expression profiles and nucleic acids, products, and antibodies thereto that are involved in 
prostate cancer; and to the use of such expression profiles and compositions in the diagnosis, 
prognosis and therapy of prostate cancer. The invention further relates to methods for 
identifying and using agents and/or targets that inhibit prostate cancer. 






BACKGROUND OF THE INVENTION 
Prostate cancer is the most commonly diagnosed internal malignancy and
second most common cause of cancer death in men in the U.S., resulting in approximately
40,000 deaths each year (Landis et al., CA Cancer J. Clin. 48:6-29 (1998); Greenlee et al.,
CA Cancer J. Clin. 50(1):7-13 (2000)), and incidence of prostate cancer has been increasing
rapidly over the past 20 years in many parts of the world (Nakata et al., Int. J. Urol.
7(7):254-257 (2000); Majeed et al., BJU Int. 85(9):1058-1062 (2000)). It develops as the
result of a pathologic transformation of normal prostate cells. In tumorigenesis, the cancer
cell undergoes initiation, proliferation and loss of contact inhibition, culminating in invasion
of surrounding tissue and, ultimately, metastasis.

Deaths from prostate cancer are a result of metastasis of a prostate tumor. 
Therefore, early detection of the development of prostate cancer is critical in reducing
mortality from this disease. Measuring levels of prostate-specific antigen (PSA) has become
a very common method for early detection and screening, and may have contributed to the
slight decrease in the mortality rate from prostate cancer in recent years (Nowroozi et al.,
Cancer Control 5(6):522-531 (1998)). However, many cases are not diagnosed until the
disease has progressed to an advanced stage.

Treatments such as surgery (prostatectomy), radiation therapy, and
cryotherapy are potentially curative when the cancer remains localized to the prostate.
Therefore, early detection of prostate cancer is important for a positive prognosis for
treatment. Systemic treatment for metastatic prostate cancer is limited to hormone therapy
and chemotherapy. Chemical or surgical castration has been the primary treatment for
symptomatic metastatic prostate cancer for over 50 years. This testicular androgen
deprivation therapy usually results in stabilization or regression of the disease (in 80% of
patients), but progression of metastatic prostate cancer eventually develops (Panvichian et al.,
Cancer Control 3(6):493-500 (1996)). Metastatic disease is currently considered incurable,
and the primary goals of treatment are to prolong survival and improve quality of life (Ragp,
Cancer Control 5(6):513-521 (1998)).

Thus, methods that can be used for diagnosis and prognosis of prostate cancer
and for effective treatment of prostate cancer, particularly metastatic prostate
cancer, would be desirable. Accordingly, provided herein are methods that can be used in
diagnosis and prognosis of prostate cancer. Further provided are methods that can be used to
screen candidate bioactive agents for the ability to modulate, e.g., treat, prostate cancer.
Additionally, provided herein are molecular targets and compositions for therapeutic
intervention in prostate cancer and other cancers.






WO 02/30268 



PCT/US01/32045 



SUMMARY OF THE INVENTION 
The present invention therefore provides nucleotide sequences of genes that
are up- and down-regulated in prostate cancer cells. Such genes are useful for diagnostic
purposes, and also as targets for screening for therapeutic compounds that modulate prostate
cancer, such as hormones or antibodies. Other aspects of the invention will become apparent
to the skilled artisan by the following description of the invention.

In one aspect, the present invention provides a method of detecting a prostate
cancer-associated transcript in a cell from a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the present invention provides a method of determining
the level of a prostate cancer-associated transcript in a cell from a patient.

In one embodiment, the present invention provides a method of detecting a
prostate cancer-associated transcript in a cell from a patient, the method comprising
contacting a biological sample from the patient with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16.

In one embodiment, the polynucleotide selectively hybridizes to a sequence at
least 95% identical to a sequence as shown in Tables 1-16. In another embodiment, the
polynucleotide comprises a sequence as shown in Tables 1-16.

In one embodiment, the biological sample is a tissue sample. In another
embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA.

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent
label.

In one embodiment, the polynucleotide is immobilized on a solid surface.

In one embodiment, the patient is undergoing a therapeutic regimen to treat
prostate cancer. In another embodiment, the patient is suspected of having metastatic
prostate cancer.

In one embodiment, the patient is a human.

In one embodiment, the patient is suspected of having a taxol-resistant cancer.

In one embodiment, the prostate cancer-associated transcript is mRNA.

In one embodiment, the method further comprises the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

In another aspect, the present invention provides a method of monitoring the
efficacy of a therapeutic treatment of prostate cancer, the method comprising the steps of: (i)
providing a biological sample from a patient undergoing the therapeutic treatment; and (ii)
determining the level of a prostate cancer-associated transcript in the biological sample by
contacting the biological sample with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring
the efficacy of the therapy. In a further embodiment, the patient has metastatic prostate
cancer. In a further embodiment, the patient has a drug-resistant (e.g., taxol-resistant) form of
prostate cancer.

In one embodiment, the method further comprises the step of: (iii) comparing
the level of the prostate cancer-associated transcript to a level of the prostate
cancer-associated transcript in a biological sample from the patient prior to, or earlier in, the
therapeutic treatment.

Additionally, provided herein is a method of evaluating the effect of a
candidate prostate cancer drug comprising administering the drug to a patient and removing a
cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Tables 1-16.

In one aspect, the present invention provides an isolated nucleic acid molecule 
consisting of a polynucleotide sequence as shown in Tables 1-16. 

In one embodiment, an expression vector or cell comprises the isolated nucleic
acid.

In one aspect, the present invention provides an isolated polypeptide which is
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-16.

In another aspect, the present invention provides an antibody that specifically
binds to an isolated polypeptide which is encoded by a nucleic acid molecule having a
polynucleotide sequence as shown in Tables 1-16.

In one embodiment, the antibody is conjugated to an effector component, e.g.,
a fluorescent label, a radioisotope or a cytotoxic chemical.

In one embodiment, the antibody is an antibody fragment. In another
embodiment, the antibody is humanized.

In one aspect, the present invention provides a method of detecting a prostate
cancer cell in a biological sample from a patient, the method comprising contacting the
biological sample with an antibody as described herein.

In another aspect, the present invention provides a method of detecting
antibodies specific to prostate cancer in a patient, the method comprising contacting a
biological sample from the patient with a polypeptide encoded by a nucleic acid comprising a
sequence from Tables 1-16.

In another aspect, the present invention provides a method for identifying a
compound that modulates a prostate cancer-associated polypeptide, the method comprising
the steps of: (i) contacting the compound with a prostate cancer-associated polypeptide, the
polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1-16; and (ii) determining the functional
effect of the compound upon the polypeptide.

In one embodiment, the functional effect is a physical effect, an enzymatic
effect, or a chemical effect.

In one embodiment, the polypeptide is expressed in a eukaryotic host cell or
cell membrane. In another embodiment, the polypeptide is recombinant.

In one embodiment, the functional effect is determined by measuring ligand
binding to the polypeptide.

In another aspect, the present invention provides a method of inhibiting
proliferation of a prostate cancer-associated cell to treat prostate cancer in a patient, the
method comprising the step of administering to the patient a therapeutically effective amount
of a compound identified as described herein.

In one embodiment, the compound is an antibody. 

In another aspect, the present invention provides a drug screening assay
comprising the steps of: (i) administering a test compound to a mammal having prostate
cancer or to a cell sample isolated therefrom; (ii) comparing the level of gene expression of a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-16 in a treated cell or mammal with the level of gene expression of the
polynucleotide in a control cell sample or mammal, wherein a test compound that modulates
the level of expression of the polynucleotide is a candidate for the treatment of prostate
cancer.

In one embodiment, the control is a mammal with prostate cancer or a cell
sample therefrom that has not been treated with the test compound. In another embodiment,
the control is a normal cell or mammal.

In one embodiment, the test compound is administered in varying amounts or
concentrations. In another embodiment, the test compound is administered for varying time
periods. In another embodiment, the comparison can occur after addition or removal of the
drug candidate.

In one embodiment, the levels of a plurality of polynucleotides that selectively
hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16 are
individually compared to their respective levels in a control cell sample or mammal. In a
preferred embodiment, the plurality of polynucleotides is from three to ten.
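The comparison of treated and control expression levels described above can be sketched in a few lines of Python. This is an illustration only, not the assay of the invention: the function name `modulated_transcripts`, the gene labels, the expression values, and the two-fold cutoff are all assumptions introduced for the example.

```python
import math

def modulated_transcripts(treated, control, min_fold=2.0):
    """Return {transcript: log2 fold change} for every transcript whose level
    in the treated sample differs from control by at least `min_fold`
    in either direction. The cutoff is an illustrative assumption."""
    cutoff = math.log2(min_fold)
    hits = {}
    for name, control_level in control.items():
        treated_level = treated.get(name)
        # Skip transcripts that were not measured or have non-positive levels.
        if treated_level is None or control_level <= 0 or treated_level <= 0:
            continue
        log2_fc = math.log2(treated_level / control_level)
        if abs(log2_fc) >= cutoff:
            hits[name] = log2_fc
    return hits

# Hypothetical expression levels (arbitrary units) for three transcripts.
control = {"geneA": 100.0, "geneB": 50.0, "geneC": 80.0}
treated = {"geneA": 25.0, "geneB": 55.0, "geneC": 320.0}
hits = modulated_transcripts(treated, control)
```

Under these assumed values, geneA (four-fold lower) and geneC (four-fold higher) would be flagged, while geneB would not. Each transcript is compared individually to its own control level, as in the embodiment above.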

In another aspect, the present invention provides a method for treating a
mammal having prostate cancer comprising administering a compound identified by the
assay described herein.

In another aspect, the present invention provides a pharmaceutical
composition for treating a mammal having prostate cancer, the composition comprising a
compound identified by the assay described herein and a physiologically acceptable
excipient.

In one aspect, the present invention provides a method of screening drug
candidates by providing a cell expressing a gene that is up- or down-regulated as in a
prostate cancer. In one embodiment, a gene is selected from Tables 1-16. The method
further includes adding a drug candidate to the cell and determining the effect of the drug
candidate on the expression of the expression profile gene.

In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease.

Also provided is a method of evaluating the effect of a candidate prostate
cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the prostate cancer modulatory protein, or an animal lacking the prostate
cancer modulatory protein, for example as a result of a gene knockout.

Moreover, provided herein is a biochip comprising one or more nucleic acid
segments of Tables 1-16, wherein the biochip comprises fewer than 1000 nucleic acid probes.
Preferably, at least two nucleic acid segments are included. More preferably, at least three
nucleic acid segments are included.

Furthermore, a method of diagnosing a disorder associated with prostate
cancer is provided. The method comprises determining the expression of a gene of Tables
1-16 in a first tissue type of a first individual, and comparing that expression to the expression
of the gene from a second normal tissue type from the first individual or a second unaffected
individual. A difference in the expression indicates that the first individual has a disorder
associated with prostate cancer.

In a further embodiment, the biochip also includes a polynucleotide sequence
of a gene that is not up- or down-regulated in prostate cancer.

In one embodiment, a method is provided for screening for a bioactive agent capable of
interfering with the binding of a prostate cancer modulating protein (prostate cancer
modulatory protein) or a fragment thereof and an antibody which binds to said prostate
cancer modulatory protein or fragment thereof. In a preferred embodiment, the method
comprises combining a prostate cancer modulatory protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said prostate cancer modulatory protein or
fragment thereof. The method further includes determining the binding of said prostate
cancer modulatory protein or fragment thereof and said antibody. Where there is a change
in binding, an agent is identified as an interfering agent. The interfering agent can be an
agonist or an antagonist. Preferably, the agent inhibits prostate cancer.

Also provided herein are methods of eliciting an immune response in an
individual. In one embodiment, a method provided herein comprises administering to an
individual a composition comprising a prostate cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Tables 1-16.

Further provided herein are compositions capable of eliciting an immune
response in an individual. In one embodiment, a composition provided herein comprises a
prostate cancer modulating protein, preferably encoded by a nucleic acid of Tables 1-16, or a
fragment thereof, and a pharmaceutically acceptable carrier. In another embodiment, said
composition comprises a nucleic acid comprising a sequence encoding a prostate cancer
modulating protein, preferably selected from the nucleic acids of Tables 1-16, and a
pharmaceutically acceptable carrier.

Also provided are methods of neutralizing the effect of a prostate cancer 
protein, or a fragment thereof, comprising contacting an agent specific for said protein with 
said protein in an amount sufficient to effect neutralization. In another embodiment, the 
protein is encoded by a nucleic acid selected from those of Tables 1-16. 

In another aspect of the invention, a method of treating an individual for
prostate cancer is provided. In one embodiment, the method comprises administering to said
individual an inhibitor of a prostate cancer modulating protein. In another embodiment, the
method comprises administering to a patient having prostate cancer an antibody to a prostate
cancer modulating protein conjugated to a therapeutic moiety. Such a therapeutic moiety can
be a cytotoxic agent or a radioisotope.

DETAILED DESCRIPTION OF THE INVENTION 
In accordance with the objects outlined above, the present invention provides
novel methods for diagnosis and prognosis evaluation for prostate cancer (PC), including
metastatic prostate cancer, as well as methods for screening for compositions which modulate
prostate cancer. Also provided are methods for treating prostate cancer.

In addition to the other nucleic acid and peptide sequences, the present
invention also relates to the identification of PAA2 as a gene that is highly overexpressed in
prostate cancer patient tissues. The PAA2 sequence is identical to the zinc transporter ZNT4.
Results presented herein demonstrate that PAA2/ZNT4 is highly expressed in prostate cancer
cells. The prostate gland is unique in that it has the highest capacity of any organ in the body
to accumulate zinc. Zinc uptake is regulated by prolactin and testosterone, which induce the
expression of a member of the ZIP family of zinc transporters (Costello et al., 1999, J. Biol.
Chem. 274:17499-17504). Zinc accumulation in the prostate functions to inhibit citrate
oxidation, which results in a decrease in cellular ATP production (Costello and Franklin,
1998, Prostate 35:285-296). Cancer cells are more sensitive to decreased ATP production
and have evolved to prevent zinc accumulation. Without wishing to be bound by theory, the
up-regulation of ZNT4 in prostate cancer cells may result in protection of the cells from high
zinc levels by its ability to pump accumulated zinc out of the cells.

The present invention also relates to nucleic acid sequences encoding PBH1.
PBH1 is related to human TRPC7 (transient receptor potential-related channels, NP_003298),
a putative calcium channel highly expressed in brain (Nagamine et al., Genomics 54:124-131
(1998)). Trp is related to melastatin, a gene down-regulated in metastatic melanomas
(Duncan et al., Cancer Res. 58:1515-1520 (1998)), and MTR1, a gene localized to within the
Beckwith-Wiedemann syndrome/Wilms' tumor susceptibility region (Prawitt et al., Hum.
Mol. Genet. 9:203-216 (2000)). Without wishing to be bound by theory, it is believed that
PBH1 functions as a calcium channel.

As a calcium channel, PBH1 is an ideal target for a small molecule
therapeutic, or a therapeutic antibody that disrupts channel function. CD20, the target of
Rituximab in non-Hodgkin's lymphoma (Maloney et al., Blood 90:2188-2195 (1997); Leget
and Czuczman, Curr. Opin. Oncol. 10:548-551 (1998)), is a plasma membrane calcium
channel expressed in B cells (Tedder and Engel, Immunol. Today 15:450-454 (1994)).
Similarly, a small molecule, or antibody that inhibits or alters a calcium signal mediated by
PBH1, will result in the death of prostate cancer cells.

PBH1, and other genes of the invention, may also be useful as targets for
cytotoxic T-lymphocytes. Genes that are tumor specific, or that are expressed in
immune-privileged organs, are currently being used as potential vaccine targets (Van den Eynde and
Boon, Int. J. Clin. Lab. Res. 27:81-86 (1997)). The expression pattern of PBH1 indicates that
it is an ideal target for cytotoxic T-lymphocytes. Thus, therapies that utilize PBH1-specific
cytotoxic T-lymphocytes to induce prostate cancer cell death are also provided by this
invention. See, e.g., U.S. Patent No. 6,051,227 and WO 00/32231, the disclosures of which
are herein incorporated by reference.

The present invention is also related to the identification of PAA3 as a gene
that is important in the modulation of prostate cancer and/or breast cancer.

Tables 1-16 provide unigene cluster identification numbers, exemplar
accession numbers, or genomic nucleotide position numbers for the nucleotide sequence of
genes that exhibit increased or decreased expression in prostate cancer samples.



Definitions 

The term "prostate cancer protein" or "prostate cancer polynucleotide" or
"prostate cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic
variants, alleles, mutants, and interspecies homologues that: (1) have a nucleotide sequence
that has greater than about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%,
90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide
sequence identity, preferably over a region of at least about 25, 50, 100, 200,
500, 1000, or more nucleotides, to a nucleotide sequence of or associated with a unigene
cluster of Tables 1-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an
immunogen comprising an amino acid sequence encoded by a nucleotide sequence of or
associated with a unigene cluster of Tables 1-16, and conservatively modified variants
thereof; (3) specifically hybridize under stringent hybridization conditions to a nucleic acid
sequence, or the complement thereof, of Tables 1-16 and conservatively modified variants
thereof; or (4) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over
a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid
sequence encoded by a nucleotide sequence of or associated with a unigene cluster of Tables
1-16. A polynucleotide or polypeptide sequence is typically from a mammal including, but
not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep,
or other mammal. A "prostate cancer polypeptide" and a "prostate cancer polynucleotide"
include both naturally occurring and recombinant forms.

A "full length" prostate cancer protein or nucleic acid refers to a prostate
cancer polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the
elements normally contained in one or more naturally occurring, wild type prostate cancer
polynucleotide or polypeptide sequences. For example, a full length prostate cancer nucleic
acid will typically comprise all of the exons that encode the full length, naturally occurring
protein. The "full length" may be prior to, or after, various stages of post-translational
processing or splicing, including alternative splicing.
"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of a prostate cancer protein, polynucleotide or
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g.,
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes,
blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. Biological samples also
include explants and primary and/or transformed cell cultures derived from patient tissues. A
biological sample is typically obtained from a eukaryotic organism, most preferably a
mammal such as a primate, e.g., chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea
pig, rat, mouse; rabbit; or a bird; reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and
aligned for maximum correspondence over a comparison window or designated region) as
measured using a BLAST or BLAST 2.0 sequence comparison algorithm with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to
be "substantially identical." This definition also refers to, or may be applied to, the
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred
algorithms can account for gaps and the like. Preferably, identity exists over a region that is
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is
50-100 amino acids or nucleotides in length.
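As a minimal illustration of the percent identity measure defined above, the following Python sketch counts matching positions in two sequences that have already been aligned. The function name `percent_identity`, the "-" gap symbol, and the choice to count gap-containing columns in the denominator are assumptions made for the example; alignment programs differ in how they treat gaps, and this is not the BLAST calculation itself.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same
    residue. Columns where both sequences have a gap are ignored; columns
    with a gap in only one sequence count as non-identical (an assumed
    convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

# Hypothetical 10-nucleotide region with a single mismatch.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
```

In this hypothetical region nine of ten positions agree, so the two sequences would be described as 90% identical over that region.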

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting typically of
from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which
a sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l.
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
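For readers unfamiliar with local alignment, the following Python sketch shows the dynamic-programming recurrence behind the Smith & Waterman local homology algorithm cited above. It is a simplified teaching version, not the Wisconsin package implementation: the function name `smith_waterman`, the scoring values (match 2, mismatch -1, linear gap penalty -2), and the example sequences are assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local
    alignment of a and b under a linear gap penalty (assumed scoring)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the score matrix; negative running scores are reset to zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Hypothetical sequences sharing the segment ACGT.
score, top, bottom = smith_waterman("TTTACGTTT", "GGGACGTGGG")
```

With these assumed inputs the shared segment ACGT is recovered as the optimal local alignment. The aligned strings returned here are the kind of input over which percent identity in a comparison window would then be computed.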

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity include the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et
al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters
described herein, to determine percent sequence identity for the nucleic acids and proteins of
the invention. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short
words of length W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database sequence. T is
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing
them. The word hits are extended in both directions along each sequence for as far as the
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g.,
for nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word
hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
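The seed-and-extend procedure described above can be sketched in Python as follows. This is a deliberately reduced illustration, not the BLAST software: it uses exact word matches only (the neighborhood threshold T is omitted), extends without gaps, and extends each direction independently from the seed score. The function names, the short word length W=4, the drop-off X=8, and the example sequences are assumptions; only M=5 and N=-4 are taken from the text.

```python
def find_word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w
    matches exactly. Real BLAST also accepts neighborhood words scoring
    at least T; that step is omitted in this sketch."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits

def _extend(query, subject, qi, sj, step, start_score, m, n, x):
    """Ungapped extension in one direction. Stops when the score falls X
    below its maximum, reaches zero or below, or a sequence end is hit.
    Returns (score gained, residues added at the best point)."""
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best - start_score, best_len

def extend_hit(query, subject, qi, sj, w, m=5, n=-4, x=8):
    """Extend one word hit in both directions; return
    (score, query_start, query_end) with query_end exclusive."""
    seed = w * m
    right, rlen = _extend(query, subject, qi + w, sj + w, 1, seed, m, n, x)
    left, llen = _extend(query, subject, qi - 1, sj - 1, -1, seed, m, n, x)
    return seed + left + right, qi - llen, qi + w + rlen

# Hypothetical sequences sharing the 7-residue segment CGTACGT.
query = "AAACGTACGTCCC"
subject = "TTTCGTACGTGGG"
hits = find_word_hits(query, subject, 4)
hsp = extend_hit(query, subject, 3, 3, 4)
```

With these assumed inputs, the word hit at position 3 of both sequences extends to cover the full shared segment, scoring 7 matches at M=5 each. Raising X would let the extension tolerate longer runs of mismatches before halting, at the cost of speed, which is the sensitivity and speed trade-off noted above.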

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA
90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001. Log values may
be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc.
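
As a small illustration of the thresholds just given, the following Python sketch sorts a smallest sum probability into the tiers named above. The function name and tier labels are hypothetical, and the cutoffs are treated as exact although the text says "about".

```python
def similarity_tier(p_n):
    """Map a smallest sum probability P(N) to the tiers in the text."""
    if p_n < 0.001:
        return "most preferred"
    if p_n < 0.01:
        return "more preferred"
    if p_n < 0.2:
        return "similar"
    return "not similar"


for p in (0.5, 0.05, 0.005, 1e-30):
    print(p, similarity_tier(p))
```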

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross
reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another
indication that two nucleic acid sequences are substantially identical is that the two molecules 
or their complements hybridize to each other under stringent conditions, as described below. 
Yet another indication that two nucleic acid sequences are substantially identical is that the 
same primers can be used to amplify the sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least 85% pure, more preferably at least 95% pure, and
most preferably at least 99% pure. "Purify" or "purification" in other embodiments means
removing at least one contaminant from the composition to be purified. In this sense,
purification does not require that the purified compound be homogeneous, e.g., 100% pure.

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residues is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers, those
containing modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG and GCU all encode the amino
acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. One of skill will recognize
that in certain contexts each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can
be modified to yield a functionally identical molecule. Accordingly, often silent variations of
a nucleic acid which encodes a polypeptide are implicit in a described sequence with respect to
the expression product, but not with respect to actual probe sequences.
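
The idea of silent variations can be illustrated with a short Python sketch that enumerates every coding sequence specifying the same peptide. The codon table here is deliberately partial (only alanine, methionine and tryptophan, written as RNA codons, so the tryptophan codon appears as UGG), and the names `SYNONYMOUS` and `silent_variants` are hypothetical helpers for this example.

```python
from itertools import product

# Partial standard genetic code, RNA codons. Only the three amino acids
# mentioned in the text are included; a real table covers all 64 codons.
SYNONYMOUS = {
    "Ala": ["GCA", "GCC", "GCG", "GCU"],  # four codons, as listed in the text
    "Met": ["AUG"],                       # ordinarily the only codon
    "Trp": ["UGG"],                       # ordinarily the only codon
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMOUS.items() for codon in codons}


def silent_variants(rna):
    """Return every RNA sequence encoding the same peptide as `rna`."""
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    choices = [SYNONYMOUS[CODON_TO_AA[codon]] for codon in codons]
    return ["".join(combo) for combo in product(*choices)]


# Met-Ala-Trp: only the alanine position can vary, giving four variants.
print(silent_variants("AUGGCAUGG"))
```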

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alter, add or delete a single amino acid or a small percentage of amino
acids in the encoded sequence are "conservatively modified variants" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention. Typically
conservative substitutions for one another are: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D),
Glutamic acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5)
Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y),
Tryptophan (W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see,
e.g., Creighton, Proteins (1984)).
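
The substitution groups listed above lend themselves to a simple lookup. The following Python sketch encodes the eight groups with one-letter symbols and tests whether a replacement stays within a group; the function name is hypothetical, and treating an unchanged residue as "not a substitution" is a choice made for this example.

```python
# The eight groups from the text, in one-letter code. Methionine (M)
# appears in both group 5 and group 8, as in the source list.
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative(original, replacement):
    """True if the two different residues share at least one group."""
    if original == replacement:
        return False  # no substitution took place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # both acidic residues, group 2
print(is_conservative("A", "W"))  # groups 1 and 6, no overlap
```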

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor & Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
often form a compact unit of the polypeptide and are typically 25 to approximately 500
amino acids long. Typical domains are made up of sections of lesser organization such as
stretches of β-sheet and α-helices. "Tertiary structure" refers to the complete three
dimensional structure of a polypeptide monomer. "Quaternary structure" refers to the three
dimensional structure formed, usually by the noncovalent association of independent tertiary
units. Anisotropic terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical
equivalents used herein means at least two nucleotides covalently linked together.
Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more
nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and
polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500,
1000, 2000, 3000, 5000, 7000, 10,000, etc. A nucleic acid of the present invention will
generally contain phosphodiester bonds, although in some cases, nucleic acid analogs are
included that may have alternate backbones, comprising, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein,
Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and
peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with
positive backbones; non-ionic backbones, and non-ribose backbones, including those
described in U.S. Patent Nos. 5,235,033 and 5,034,506, and Chapters 6 and 7, ASC
Symposium Series 580, Carbohydrate Modifications in Antisense Research, Sanghui &
Cook, eds. Nucleic acids containing one or more carbocyclic sugars are also included within
one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done
for a variety of reasons, e.g., to increase the stability and half-life of such molecules in
physiological environments or as probes on a biochip. Mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.

A variety of references disclose such nucleic acid analogs, including, for
example, phosphoramidate (Beaucage et al., Tetrahedron 49(10):1925 (1993) and references
therein; Letsinger, J. Org. Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579
(1977); Letsinger et al., Nucl. Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805
(1984); Letsinger et al., J. Am. Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica
Scripta 26:141 (1986)), phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991);
and U.S. Patent No. 5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321
(1989)), O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues:
A Practical Approach, Oxford University Press), and peptide nucleic acid backbones and
linkages (see Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl.
31:1008 (1992); Nielsen, Nature, 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all
of which are incorporated by reference). Other analog nucleic acids include those with
positive backbones (Denpcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic
backbones (U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863;
Kiedrowshi et al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994);
Chapters 2 and 3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense
Research", Ed. Y.S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal
Chem. Lett. 4:395 (1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett.
37:743 (1996)) and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate
Modifications in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic acids
containing one or more carbocyclic sugars are also included within one definition of nucleic
acids (see Jenkins et al., Chem. Soc. Rev. (1995) pp 169-176). Several nucleic acid analogs
are described in Rawls, C & E News June 2, 1997 page 35. All of these references are hereby
expressly incorporated by reference.

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded and single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a
naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
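
The point made above, that depicting one strand also defines its complementary strand, can be shown with a short Python sketch. It handles only the four standard DNA bases; the function name is a hypothetical helper, and modified bases such as inosine are outside its scope.

```python
# Watson-Crick pairing for the four standard DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which is why either strand suffices to describe a duplex.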

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include fluorescent dyes, electron-dense reagents, enzymes
(e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins or other
entities which can be made detectable, e.g., by incorporating a radiolabel into the peptide or
used to detect antibodies specifically reactive with the peptide. The radioisotope may be, for
example, 3H, 14C, 32P, 35S, or 125I. In some cases, particularly using antibodies against the
proteins of the invention, the radioisotopes are used as toxic moieties, as described below.
The labels may be incorporated into the prostate cancer nucleic acids, proteins and antibodies
at any position. Any method known in the art for conjugating the antibody to the label may
be employed, including those methods described by Hunter et al., Nature, 144:945 (1962);
David et al., Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981);
and Nygren, J. Histochem. and Cytochem., 30:407 (1982). The lifetime of radiolabeled
peptides or radiolabeled antibody compositions may be extended by the addition of substances
that stabilize the radiolabeled peptide or antibody and protect it from degradation. Any
substance or combination of substances that stabilize the radiolabeled peptide or antibody may
be used including those substances disclosed in US Patent No. 5,961,955.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard" e.g., beta radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe
may be joined by a linkage other than a phosphodiester bond, so long as it does not
functionally interfere with hybridization. Thus, e.g., probes may be peptide nucleic acids in
which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
It will be understood by one of skill in the art that probes may bind target sequences lacking
complete complementarity with the probe sequence depending upon the stringency of the
hybridization conditions. The probes are preferably directly labeled as with isotopes,
chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a
streptavidin complex may later bind. By assaying for the presence or absence of the probe,
one can detect the presence or absence of the select sequence or subsequence. Diagnosis or
prognosis may be based at the genomic level, or at the level of RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, e.g.,
recombinant cells express genes that are not found within the native (non-recombinant) form
of the cell or express native genes that are otherwise abnormally expressed, under expressed
or not expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence that is
determinative of the presence of the nucleotide sequence, in a heterogeneous population of
nucleic acids and other biologics (e.g., total cellular or library DNA or RNA). Similarly, the
phrase "specifically (or selectively) binds" to an antibody or "specifically (or selectively)
immunoreactive with," when referring to a protein or peptide, refers to a binding reaction that
is determinative of the presence of the protein, in a heterogeneous population of proteins and
other biologics. Thus, under designated immunoassay or nucleic acid hybridization
conditions, the specified antibodies or nucleic acid probes bind to a particular protein or
nucleotide sequence at least two times the background and more typically more than 10 to
100 times background.

Specific binding to an antibody under such conditions requires an antibody
that is selected for its specificity for a particular protein. For example, polyclonal antibodies
raised to a particular protein, polymorphic variants, alleles, orthologs, and conservatively
modified variants, or splice variants, or portions thereof, can be selected to obtain only those
polyclonal antibodies that are specifically immunoreactive with the desired prostate cancer
protein and not with other proteins. This selection may be achieved by subtracting out
antibodies that cross-react with other molecules. A variety of immunoassay formats may be
used to select antibodies specifically immunoreactive with a particular protein. For example,
solid-phase ELISA immunoassays are routinely used to select antibodies specifically
immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual
(1988) for a description of immunoassay formats and conditions that can be used to
determine specific immunoreactivity).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec. - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., N.Y.
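
Two of the numerical rules in the paragraph above, the 5-10°C offset below Tm and the signal-to-background criterion, can be written out as a short Python sketch. The function names are hypothetical, the Tm is assumed to be supplied by the caller (no Tm formula is given in the text, and none is implied here), and "about" is treated as exact.

```python
def stringent_temperature_range(tm_celsius):
    """Return (low, high) hybridization temperatures 10 to 5 °C below Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal, background, fold=2.0):
    """True if the signal is at least `fold` times background.

    fold=2.0 is the minimum named in the text; fold=10.0 is the
    preferred level.
    """
    return signal >= fold * background


print(stringent_temperature_range(70.0))
print(is_positive_signal(250, 100))
print(is_positive_signal(250, 100, fold=10.0))
```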

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of a prostate cancer protein includes the determination of a parameter
that is indirectly or directly under the influence of the prostate cancer protein or nucleic acid,
e.g., a functional, physical, or chemical effect, such as the ability to decrease prostate cancer.
It includes ligand binding activity; cell growth on soft agar; anchorage dependence; contact
inhibition and density limitation of growth; cellular proliferation; cellular transformation;
growth factor or serum dependence; tumor specific marker levels; invasiveness into Matrigel;
tumor growth and metastasis in vivo; mRNA and protein expression in cells undergoing
metastasis, and other characteristics of prostate cancer cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a
prostate cancer protein sequence, e.g., functional, enzymatic, physical and chemical effects.
Such functional effects can be measured by any means known to those skilled in the art, e.g.,
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index),
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein,
measuring inducible markers or transcriptional activation of the prostate cancer protein;
measuring binding activity or binding assays, e.g., binding to antibodies or other ligands, and
measuring cellular proliferation. Determination of the functional effect of a compound on
prostate cancer can also be performed using prostate cancer assays known to those of skill in
the art such as in vitro assays, e.g., cell growth on soft agar; anchorage dependence;
contact inhibition and density limitation of growth; cellular proliferation; cellular
transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of prostate cancer cells.
The functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for prostate cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of prostate cancer polynucleotide
and polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules
or compounds identified using in vitro and in vivo assays of prostate cancer polynucleotide
and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally
block activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate
the activity or expression of prostate cancer proteins, e.g., antagonists. Antisense nucleic
acids may seem to inhibit expression and subsequent function of the protein. "Activators"
are compounds that increase, open, activate, facilitate, enhance activation, sensitize, agonize,
or up regulate prostate cancer protein activity. Inhibitors, activators, or modulators also
include genetically modified versions of prostate cancer proteins, e.g., versions with altered
activity, as well as naturally occurring and synthetic ligands, antagonists, agonists, antibodies,
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g.,
expressing the prostate cancer protein in vitro, in cells, or cell membranes, applying putative
modulator compounds, and then determining the functional effects on activity, as described
above. Activators and inhibitors of prostate cancer can also be identified by incubating
prostate cancer cells with the test compound and determining increases or decreases in the
expression of 1 or more prostate cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50
or more prostate cancer proteins, such as prostate cancer proteins encoded by the sequences
set out in Tables 1-16.

Samples or assays comprising prostate cancer proteins that are treated with a 
potential activator, inhibitor, or modulator are compared to control samples without the 
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples 
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition 
of a polypeptide is achieved when the activity value relative to the control is about 80%, 
preferably 50%, more preferably 25-0%. Activation of a prostate cancer polypeptide is 
achieved when the activity value relative to the control (untreated with activators) is 110%, 
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the 
control), more preferably 1000-3000% higher. 

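
The comparison against an untreated control described above can be sketched as a short 
calculation. The following Python fragment is illustrative only: the function names are 
hypothetical, and reading "about 80%" as "at or below 80%" and "110%" as "at or above 
110%" is an assumption made for the purpose of the example. 

```python
def relative_activity(treated: float, control: float) -> float:
    """Activity of a treated sample as a percentage of the untreated control.

    The untreated control is assigned a value of 100%.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * treated / control


def classify_modulation(percent: float,
                        inhibition_cutoff: float = 80.0,
                        activation_cutoff: float = 110.0) -> str:
    """Label a relative activity value using the broadest cutoffs in the text.

    Assumption: values at or below the inhibition cutoff count as inhibition,
    values at or above the activation cutoff count as activation.
    """
    if percent <= inhibition_cutoff:
        return "inhibition"
    if percent >= activation_cutoff:
        return "activation"
    return "no call"
```

The preferred ranges (e.g., 50% or 150%) can be applied by passing stricter cutoff values. 
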
The phrase "changes in cell growth" refers to any change in cell growth and 
proliferation characteristics in vitro or in vivo, such as formation of foci, anchorage 
independence, semi-solid or soft agar growth, changes in contact inhibition and density 
limitation of growth, loss of growth factor or serum requirements, changes in cell 
morphology, gaining or losing immortalization, gaining or losing tumor specific markers, 
ability to form or suppress tumors when injected into suitable animal hosts, and/or 
immortalization of the cell. See, e.g., Freshney, Culture of Animal Cells: A Manual of Basic 
Technique pp. 231-241 (3rd ed. 1994). 

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor. 

"Cancer cells," "transformed" cells or "transformation" in tissue culture, refers 
to spontaneous or induced phenotypic changes that do not necessarily involve the uptake of 
new genetic material. Although transformation can arise from infection with a transforming 
virus and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise 
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene. 
Transformation is associated with phenotypic changes, such as immortalization of cells, 
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney, 
Culture of Animal Cells: A Manual of Basic Technique (3rd ed. 1994)). 

"Antibody" refers to a polypeptide comprising a framework region from an 
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region 
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as 
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody or 
its functional equivalent will be most critical in specificity and affinity of binding. See Paul, 
Fundamental Immunology. 

An exemplary immunoglobulin (antibody) structural unit comprises a 
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair 
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus 
of each chain defines a variable region of about 100 to 110 or more amino acids primarily 
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy 
chain (VH) refer to these light and heavy chains respectively. 

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized 
fragments produced by digestion with various peptidases. Thus, e.g., pepsin 
digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a 
dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 
may be reduced under mild conditions to break the disulfide linkage in the hinge region, 
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is 
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed. 
1993)). While various antibody fragments are defined in terms of the digestion of an intact 
antibody, one of skill will appreciate that such fragments may be synthesized de novo either 
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used 
herein, also includes antibody fragments either produced by the modification of whole 
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single 
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature 
348:552-554 (1990)). 

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal 
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein, 
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp. 
77-96 in Monoclonal Antibodies and Cancer Therapy (1985); Coligan, Current Protocols in 
Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual (1988); and Goding, 
Monoclonal Antibodies: Principles and Practice (2d ed. 1986)). Techniques for the 
production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce 
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such 
as other mammals, may be used to express humanized antibodies. Alternatively, phage 
display technology can be used to identify antibodies and heteromeric Fab fragments that 
specifically bind to selected antigens (see, e.g., McCafferty et al., Nature 348:552-554 
(1990); Marks et al., Biotechnology 10:779-783 (1992)). 

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function 
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 

Identification of prostate cancer-associated sequences 

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the 
sample; while two states may have any particular gene similarly expressed, the evaluation of 
a number of genes simultaneously allows the generation of a gene expression profile that is 
characteristic of the state of the cell. That is, normal tissue (e.g., normal prostate or other 
tissue) may be distinguished from cancerous or metastatic cancerous tissue of the prostate, or 
prostate cancer tissue or metastatic prostate cancerous tissue can be compared with tissue 
samples of prostate and other tissues from surviving cancer patients. By comparing 
expression profiles of tissue in known different prostate cancer states, information regarding 
which genes are important (including both up- and down-regulation of genes) in each of these 
states is obtained. 

The identification of sequences that are differentially expressed in prostate 
cancer versus non-prostate cancer tissue allows the use of this information in a number of 
ways. For example, a particular treatment regime may be evaluated: does a chemotherapeutic 
drug act to down-regulate prostate cancer, and thus tumor growth or recurrence, in a 
particular patient? Similarly, diagnosis and treatment outcomes may be made or confirmed by 
comparing patient samples with the known expression profiles. Metastatic tissue can also be 
analyzed to determine the stage of prostate cancer in the tissue. Furthermore, these gene 
expression profiles (or individual genes) allow screening of drug candidates with an eye to 
mimicking or altering a particular expression profile; e.g., screening can be done for drugs 
that suppress the prostate cancer expression profile. This may be done by making biochips 
comprising sets of the important prostate cancer genes, which can then be used in these 
screens. These methods can also be done on the protein basis; that is, protein expression 
levels of the prostate cancer proteins can be evaluated for diagnostic purposes or to screen 
candidate agents. In addition, the prostate cancer nucleic acid sequences can be administered 
for gene therapy purposes, including the administration of antisense nucleic acids, or the 
prostate cancer proteins (including antibodies and other modulators thereof) administered as 
therapeutic drugs. 
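
The comparison of a patient sample with known expression profiles can be sketched as 
follows. This Python fragment is a minimal illustration, not the method of the invention: 
the use of Pearson correlation as the similarity measure, the function names, and the 
representation of each profile as a list of expression values in a fixed gene order are all 
assumptions made for the example. 

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def closest_state(sample, references):
    """Return the name of the reference state whose profile best matches the sample.

    references maps a state name (hypothetical labels such as "normal" or
    "cancer") to a profile listed in the same gene order as the sample.
    """
    return max(references, key=lambda state: pearson(sample, references[state]))
```

For example, with reference profiles for "normal" and "cancer" tissue, 
closest_state(sample, references) returns whichever label correlates more strongly with the 
sample. 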

Thus the present invention provides nucleic acid and protein sequences that 
are differentially expressed in prostate cancer, herein termed "prostate cancer sequences." As 
outlined below, prostate cancer sequences include those that are up-regulated (i.e., expressed 
at a higher level) in prostate cancer, as well as those that are down-regulated (i.e., expressed 
at a lower level). In a preferred embodiment, the prostate cancer sequences are from humans; 
however, as will be appreciated by those in the art, prostate cancer sequences from other 
organisms may be useful in animal models of disease and drug evaluation; thus, other 
prostate cancer sequences are provided, from vertebrates, including mammals, including 
rodents (rats, mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, 
goats, pigs, cows, horses, etc.) and pets (e.g., dogs, cats, etc.). Prostate cancer sequences 
from other organisms may be obtained using the techniques outlined below. 

Prostate cancer sequences can include both nucleic acid and amino acid 
sequences. As will be appreciated by those in the art and is more fully outlined below, 
prostate cancer nucleic acid sequences are useful in a variety of applications, including 
diagnostic applications, which will detect naturally occurring nucleic acids, as well as 
screening applications; e.g., biochips comprising nucleic acid probes or PCR microtiter plates 
with selected probes to the prostate cancer sequences can be generated. 

A prostate cancer sequence can be initially identified by substantial nucleic 
acid and/or amino acid sequence homology to the prostate cancer sequences outlined herein. 
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is 
generally determined as outlined below, using either homology programs or hybridization 
conditions. 

WO 02/30268 



PCT/US01/32045 



For identifying prostate cancer-associated sequences, the prostate cancer 
screen typically includes comparing genes identified in different tissues, e.g., normal and 
cancerous tissues, or tumor tissue samples from patients who have metastatic disease vs. 
non-metastatic tissue. Other suitable tissue comparisons include comparing prostate cancer 
samples with metastatic cancer samples from other cancers, such as lung, breast, 
gastrointestinal cancers, ovarian, etc. Samples of different stages of prostate cancer, e.g., 
survivor tissue, drug resistant states, and tissue undergoing metastasis, are applied to biochips 
comprising nucleic acid probes. The samples are first microdissected, if applicable, and 
treated as is known in the art for the preparation of mRNA. Suitable biochips are 
commercially available, e.g., from Affymetrix. Gene expression profiles as described herein 
are generated and the data analyzed. 

In one embodiment, the genes showing changes in expression as between 
normal and disease states are compared to genes expressed in other normal tissues, preferably 
normal prostate, but also including, and not limited to, lung, heart, brain, liver, breast, kidney, 
muscle, colon, small intestine, large intestine, spleen, bone and placenta. In a preferred 
embodiment, those genes identified during the prostate cancer screen that are expressed in 
any significant amount in other tissues are removed from the profile, although in some 
embodiments, this is not necessary. That is, when screening for drugs, it is usually preferable 
that the target be disease specific, to minimize possible side effects. 
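
The removal of candidate genes that are expressed in other normal tissues can be 
sketched as a simple filter. The Python fragment below is an illustration under stated 
assumptions: the data layout (a mapping from gene to per-tissue expression level), the 
function name, and the numeric cutoff standing in for "any significant amount" are 
hypothetical and are not specified in the text. 

```python
def disease_specific(candidates, normal_expression, max_level=100.0):
    """Keep candidate genes that stay below max_level in every other normal tissue.

    normal_expression maps gene -> {tissue: expression level}. The cutoff
    max_level is an assumed stand-in for "any significant amount".
    Genes with no recorded normal-tissue data are kept.
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < max_level for level in levels.values()):
            kept.append(gene)
    return kept
```

In embodiments where removal is not necessary, the filter is simply omitted. 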

In a preferred embodiment, prostate cancer sequences are those that are 
up-regulated in prostate cancer, that is, the expression of these genes is higher in the prostate 
cancer tissue as compared to non-cancerous tissue. "Up-regulation" as used herein often 
means at least about a two-fold change, preferably at least about a three fold change, with at 
least about five-fold or higher being preferred. All unigene cluster identification numbers 
and accession numbers herein are for the GenBank sequence database and the sequences of 
the accession numbers are hereby expressly incorporated by reference. GenBank is known in 
the art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and 
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g., 
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ). 

In another preferred embodiment, prostate cancer sequences are those that are 
down-regulated in prostate cancer, that is, the expression of these genes is lower in prostate 
cancer tissue as compared to non-cancerous tissue (see, e.g., Tables 8, 12 and 14). 
"Down-regulation" as used herein often means at least about a 1.5-fold change, more preferably a 
two-fold change, preferably at least about a three fold change, with at least about five-fold or 
higher being most preferred. 
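
The fold-change criteria for up- and down-regulation can be illustrated with a short 
calculation. This Python sketch assumes that expression is available as a single positive 
value per gene for cancerous and non-cancerous tissue; the function names and the default 
cutoffs (two-fold for up-regulation, 1.5-fold for down-regulation, i.e., the broadest values 
given above) are choices made for the example. 

```python
def fold_change(cancer: float, normal: float) -> float:
    """Ratio of expression in cancer tissue to expression in non-cancerous tissue."""
    if cancer <= 0 or normal <= 0:
        raise ValueError("expression values must be positive")
    return cancer / normal


def regulation(cancer: float, normal: float,
               up_cutoff: float = 2.0, down_cutoff: float = 1.5) -> str:
    """Classify a gene by the fold difference between cancer and normal tissue."""
    if fold_change(cancer, normal) >= up_cutoff:
        return "up-regulated"
    # Down-regulation is measured as the fold decrease, normal over cancer.
    if normal / cancer >= down_cutoff:
        return "down-regulated"
    return "unchanged"
```

Stricter embodiments (three-fold, five-fold) correspond to larger cutoff arguments. 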



Informatics 

The ability to identify genes that are over or under expressed in prostate 
cancer can additionally provide high-resolution, high-sensitivity datasets which can be used 
in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, protein 
structure, biosensor development, and other related areas. For example, the expression 
profiles can be used in diagnostic or prognostic evaluation of patients with prostate cancer. 
Or as another example, subcellular toxicological information can be generated to better direct 
drug structure and activity correlation (see Anderson, Pharmaceutical Proteomics: Targets, 
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA 
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological 
sensor device to predict the likely toxicological effect of chemical exposures and likely 
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue 
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids, 
saccharides, lipids, drugs, and the like). 

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired, e.g., 
using array analysis either singly or in a library format. The database can be in substantially 
any form in which data can be maintained and transmitted, but is preferably an electronic 
database. The electronic database of the invention can be maintained on any electronic 
device allowing for the storage of and access to the database, such as a personal computer, 
but is preferably distributed on a wide area network, such as the World Wide Web. 

The focus of the present section on databases that include peptide sequence 
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar 
databases can be assembled for any assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative 
and/or absolute abundance of a variety of molecular and macromolecular species from a 
biological sample undergoing prostate cancer, i.e., the identification of prostate 
cancer-associated sequences described herein, provide an abundance of information, which can be 
correlated with pathological conditions, predisposition to disease, drug testing, therapeutic 
monitoring, gene-disease causal linkages, identification of correlates of immunity and 
physiological status, among others. Although the data generated from the assays of the 
invention is suited for manual review and analysis, in a preferred embodiment, prior data 
processing using high-speed computers is utilized. 

An array of methods for indexing and retrieving biomolecular information is 
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational 
database system for storing biomolecular sequence information in a manner that allows 
sequences to be catalogued and searched according to one or more protein function 
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records 
containing information in a format that allows a collection of partial-length DNA sequences 
to be catalogued and searched according to association with one or more sequencing projects 
for obtaining full-length sequences from the collection of partial length sequences. U.S. 
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene 
sequence similar to a sequence data item in a gene database based on the degree of similarity 
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method 
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences 
in computer databases by comparison of predicted mass spectra with experimentally-derived 
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a 
multi-dimensional database comprising a functionality for multi-dimensional data analysis 
described as on-line analytical processing (OLAP), which entails the consolidation of 
projected and actual data according to more than one consolidation path or dimension. U.S. 
Patent 5,295,261 reports a hybrid database structure in which the fields of each database 
record are divided into two classes, navigational and informational data, with navigational 
fields stored in a hierarchical topological map which can be viewed as a tree structure or as 
the merger of two or more such tree structures. 

See also Mount et al., Bioinformatics (2001); Biological Sequence Analysis: 
Probabilistic Models of Proteins and Nucleic Acids (Durbin et al., eds., 1999); 
Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Baxevanis & 
Ouellette eds., 1998); Rashidi & Buehler, Bioinformatics: Basic Applications in Biological 
Science and Medicine (1999); Introduction to Computational Molecular Biology (Setubal et 
al., eds., 1997); Bioinformatics: Methods and Protocols (Misener & Krawetz, eds., 2000); 
Bioinformatics: Sequence, Structure, and Databanks: A Practical Approach (Higgins & 
Taylor, eds., 2000); Brown, Bioinformatics: A Biologist's Guide to Biocomputing and the 
Internet (2001); Han & Kamber, Data Mining: Concepts and Techniques (2000); and 
Waterman, Introduction to Computational Biology: Maps, Sequences, and Genomes (1995). 

The present invention provides a computer database comprising a computer 
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g., 
with data specifying the source of the target-containing sample from which each sequence 
specificity record was obtained. 

In an exemplary embodiment, at least one of the sources of target-containing 
sample is from a control tissue sample known to be free of pathological disorders. In a 
variation, at least one of the sources is a known pathological tissue specimen, e.g., a 
neoplastic lesion or another tissue specimen to be analyzed for prostate cancer. In another 
variation, the assay records cross-tabulate one or more of the following parameters for each 
target species in a sample: (1) a unique identification code, which can include, e.g., a target 
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic 
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species 
present in the sample. 
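
One way to picture an assay record carrying the three parameters listed above, and 
a cross-tabulation of such records by sample source, is the following Python sketch. The 
class name, field names, and in-memory dictionary layout are hypothetical; the text does 
not prescribe any particular record format or storage scheme. 

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Build a table of the form {target_id: {sample_source: quantity}}."""
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    return dict(table)
```

A row of the resulting table then shows, for one target, its quantity in each sample 
source, e.g., a control tissue sample alongside a pathological specimen. 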

The invention also provides for the storage and retrieval of a collection of 
target data in a computer data storage apparatus, which can include magnetic disks, optical 
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, 
magnetic bubble memory devices, and other data storage devices, including CPU registers 
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern 
in an array of magnetic domains on a magnetizable medium or as an array of charge states or 
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of 
a transistor and a charge storage area, which may be on the transistor). In one embodiment, 
the invention provides such storage devices, and computer systems built therewith, 
comprising a bit pattern encoding a protein expression fingerprint record comprising unique 
identifiers for at least 10 target data records cross-tabulated with target source. 

When the target is a peptide or nucleic acid, the invention preferably provides 
a method for identifying related peptide or nucleic acid sequences, comprising performing a 
computerized comparison between a peptide or nucleic acid sequence assay record stored in 
or retrieved from a computer storage device or database and at least one other sequence. The 
comparison can include a sequence analysis or comparison algorithm or computer program 
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may 
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences 
determined from a polypeptide or nucleic acid sample of a specimen. 

The invention also preferably provides a magnetic disk, such as an 
IBM-compatible (DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format 
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or 
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of 
the invention in a file format suitable for retrieval and processing in a computerized sequence 
analysis, comparison, or relative quantitation method. 

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, 
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium, 
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem, 
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for 
comparing a query target to a database containing an array of data structures, such as an assay 
result obtained by the method of the invention, and ranking database targets based on the 
degree of identity and gap weight to the target data. A central processor is preferably 
initialized to load and execute the computer program for alignment and/or comparison of the 
assay results. Data for a query target is entered into the central processor via an I/O device. 
Execution of the computer program results in the central processor retrieving the assay data 
from the data file, which comprises a binary description of an assay result. 

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM, 
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence 
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the 
same characteristic of the query target and results are output via an I/O device. For example, 
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, 
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or 
public domain molecular biology software package (e.g., UWGCG Sequence Analysis 
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory 
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, 
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem, 
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or 
other suitable I/O device. 

The invention also preferably provides the use of a computer system, such as 
that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a 
collection of peptide sequence specificity records obtained by the methods of the invention, 
which may be stored in the computer; (3) a comparison target, such as a query target; and (4) 
a program for alignment and comparison, typically with rank-ordering of comparison results 
on the basis of computed similarity values. 
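
The rank-ordering of database targets against a query target can be sketched as 
follows. This Python fragment is illustrative only: it uses the standard-library 
difflib.SequenceMatcher ratio as a simple stand-in for a computed similarity value and is 
not an implementation of FASTA, GAP, BESTFIT or any gap-weighted alignment program; 
the function names and the dictionary used as the "database" are hypothetical. 

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity value between 0.0 and 1.0 for two sequences (stand-in measure)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def rank_targets(query: str, database: dict) -> list:
    """Return (name, similarity) pairs sorted from most to least similar to the query.

    database maps a target name to its sequence.
    """
    scored = [(name, similarity(query, sequence)) for name, sequence in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In practice the similarity function would be replaced by the alignment or comparison 
program of choice, with the ranking step unchanged. 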

Characteristics of prostate cancer-associated proteins 

Prostate cancer proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the prostate 
cancer protein is an intracellular protein. Intracellular proteins may be found in the 
cytoplasm and/or in the nucleus. Intracellular proteins are involved in all aspects of cellular 
function and replication (including, e.g., signaling pathways); aberrant expression of such 
proteins often results in unregulated or dysregulated cellular processes (see, e.g., Molecular 
Biology of the Cell (Alberts, ed., 3rd ed., 1994)). For example, many intracellular proteins 
have enzymatic activity such as protein kinase activity, protein phosphatase activity, protease 
activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins 
also serve as docking proteins that are involved in organizing complexes of proteins, or 
targeting proteins to various subcellular localizations, and are involved in maintaining the 
structural integrity of organelles. 

An increasingly appreciated concept in characterizing proteins is the presence 
in the proteins of one or more motifs for which defined functions have been attributed. In 
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly 
conserved sequences have been identified in proteins that are involved in protein-protein 
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated 
targets in a sequence dependent manner. PTB domains, which are distinct from SH2 
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich 
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a 
few, have been shown to mediate protein-protein interactions. Some of these may also be 
involved in binding to phospholipids or other second messengers. As will be appreciated by 
one of ordinary skill in the art, these motifs can be identified on the basis of primary 
sequence; thus, an analysis of the sequence of proteins may provide insight into both the 
enzymatic potential of the molecule and/or molecules with which the protein may associate. 
One useful database is Pfam (protein families), which is a large collection of multiple 
sequence alignments and hidden Markov models covering many common protein domains. 
Versions are available via the internet from Washington University in St. Louis, the Sanger 
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman et al., Nuc. 
Acids Res. 28:263-266 (2000); Sonnhammer et al., Proteins 28:405-420 (1997); Bateman et 
al., Nuc. Acids Res. 27:260-262 (1999); and Sonnhammer et al., Nuc. Acids Res. 26:320-322 
(1998)). 

In another embodiment, the prostate cancer sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. 
They may have an intracellular domain, an extracellular domain, or both. The intracellular 
domains of such proteins may have a number of functions including those already described 
for intracellular proteins. For example, the intracellular domain may have enzymatic activity 
and/or may serve as a binding site for additional proteins. Frequently the intracellular 
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine 
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation 
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain 
containing proteins. 

Transmembrane proteins may contain from one to many transmembrane
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that
may be followed by charged amino acids. Therefore, upon analysis of the amino acid
sequence of a particular protein, the localization and number of transmembrane domains
within the protein may be predicted (see, e.g., PSORT web site http://psort.nibb.ac.jp/).
Important transmembrane protein receptors include, but are not limited to, the insulin
receptor, insulin-like growth factor receptor, human growth hormone receptor, glucose
transporters, transferrin receptor, epidermal growth factor receptor, low density lipoprotein
receptor, leptin receptor, and interleukin receptors, e.g., the IL-1 receptor and the IL-2
receptor.
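
By way of illustration only, the prediction of transmembrane domains from runs of hydrophobic residues can be sketched as a sliding-window hydropathy scan. The sketch below is not the PSORT algorithm; it assumes the published Kyte-Doolittle hydropathy scale, a 19-residue window, and a cutoff of 1.6, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not taken from this specification).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every window of `window` consecutive residues."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def predict_tm_segments(seq, window=19, threshold=1.6):
    """Return (start, end) residue ranges (0-based, end exclusive) covered by
    consecutive windows whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(seq, window)
    segments = []
    start = None
    for i, score in enumerate(profile):
        if score >= threshold:
            if start is None:
                start = i  # first hydrophobic window of a new segment
        elif start is not None:
            # the last qualifying window began at i - 1 and spans `window` residues
            segments.append((start, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start, len(profile) - 1 + window))
    return segments


if __name__ == "__main__":
    # Toy sequence: 20 leucines flanked by charged residues.
    toy = "DEKR" * 5 + "L" * 20 + "KRDE" * 5
    print(predict_tm_segments(toy))
```

The number of segments returned gives a rough estimate of the number of membrane-spanning regions; the window length and cutoff are adjustable assumptions, and the predicted boundaries are approximate.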

The extracellular domains of transmembrane proteins are diverse; however,
conserved motifs are found repeatedly among various extracellular domains. Conserved
structure and/or functions have been ascribed to different extracellular motifs. Many
extracellular domains are involved in binding to other molecules. In one aspect, extracellular
domains are found on receptors. Factors that bind the receptor domain include circulating
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like.
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol (GPI)
anchor, or may themselves be transmembrane proteins. Extracellular domains also associate
with the extracellular matrix and contribute to the maintenance of the cell structure.

Prostate cancer proteins that are transmembrane are particularly preferred in
the present invention as they are readily accessible targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can 
be made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the prostate cancer proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Prostate cancer proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, e.g.,
for blood, plasma, serum, or stool tests.

Use of prostate cancer nucleic acids 

As described above, a prostate cancer sequence is initially identified by
substantial nucleic acid and/or amino acid sequence homology or linkage to the prostate
cancer sequences outlined herein. Such homology can be based upon the overall nucleic acid
or amino acid sequence, and is generally determined as outlined below, using either
homology programs or hybridization conditions. Typically, linked sequences on an mRNA are
found on the same molecule.

The prostate cancer nucleic acid sequences of the invention, e.g., the 
sequences in Tables 1-16, can be fragments of larger genes, i.e., they are nucleic acid 
segments. "Genes" in this context includes coding regions, non-coding regions, and mixtures
of coding and non-coding regions. Accordingly, as will be appreciated by those in the art,
using the sequences provided herein, extended sequences, in either direction, of the prostate
cancer genes can be obtained, using techniques well known in the art for cloning either longer
sequences or the full length sequences; see Ausubel et al., supra. Much can be done by
informatics and many sequences can be clustered to include multiple sequences
corresponding to a single gene, e.g., systems such as UniGene (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

Once the prostate cancer nucleic acid is identified, it can be cloned and, if
necessary, its constituent parts recombined to form the entire prostate cancer nucleic acid
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g.,
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid
segment, the recombinant prostate cancer nucleic acid can be further used as a probe to
identify and isolate other prostate cancer nucleic acids, e.g., extended coding regions. It can
also be used as a "precursor" nucleic acid to make modified or variant prostate cancer nucleic
acids and proteins.

The prostate cancer nucleic acids of the present invention are used in several
ways. In a first embodiment, nucleic acid probes to the prostate cancer nucleic acids are
made and attached to biochips to be used in screening and diagnostic methods, as outlined
below, or for administration, e.g., for gene therapy, vaccine, and/or antisense applications.
Alternatively, the prostate cancer nucleic acids that include coding regions of prostate cancer
proteins can be put into expression vectors for the expression of prostate cancer proteins,
again for screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to prostate cancer nucleic
acids (both the nucleic acid sequences outlined in the figures and/or the complements thereof)
are made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the prostate cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.
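
As a purely illustrative sketch, imperfect complementarity between a probe and a target site can be expressed as a count of mismatched base pairs. The cutoff used below (at most 10% mismatches) is an arbitrary assumption for the example and is not a definition of "substantially complementary", which herein depends on hybridization under the stated conditions; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe, target_site):
    """Number of mismatched positions when `probe` anneals to an
    equal-length `target_site` (both written 5' to 3')."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


def within_mismatch_cutoff(probe, target_site, max_mismatch_fraction=0.1):
    """Illustrative screen only: True if the mismatch fraction is at or
    below an assumed cutoff. Real hybridization depends on stringency."""
    return count_mismatches(probe, target_site) <= max_mismatch_fraction * len(probe)


if __name__ == "__main__":
    site = "ATGGCCTTAGGCATCGATCG"
    probe = reverse_complement(site)
    print(probe, count_mismatches(probe, site))
```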

A nucleic acid probe is generally single stranded but can be partially single
and partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to
hundreds of bases.

In a preferred embodiment, more than one probe per sequence is used, with
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
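
For illustration, the selection of several probe sites per target can be sketched as evenly spaced windows along the target sequence. The spacing rule, the default probe length of 40 bases, and the function name are assumptions made for this example only; the windows returned are target subsequences, and the probe itself would be the complement of each window.

```python
def tile_probe_sites(target, probe_len=40, n_probes=3):
    """Return (start, subsequence) pairs for evenly spaced probe sites.

    Sites overlap when the target is short relative to
    n_probes * probe_len and are separate otherwise. Fewer than
    n_probes sites are returned if rounding makes two starts coincide.
    """
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if probe_len > len(target):
        raise ValueError("probe_len exceeds target length")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        # spread the start positions from 0 to last_start, dropping duplicates
        starts = sorted({round(i * last_start / (n_probes - 1)) for i in range(n_probes)})
    return [(s, target[s:s + probe_len]) for s in starts]


if __name__ == "__main__":
    toy_target = "ACGT" * 25  # 100-base toy target
    for start, site in tile_probe_sites(toy_target):
        print(start, site)
```

With a 100-base target and 40-base probes, adjacent sites share 10 bases; with a 200-base target the three sites do not overlap.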

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as
will be appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid
support" or other grammatical equivalents herein is meant a material that can be modified to
contain discrete individual sites appropriate for the attachment or association of the nucleic
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large, and includes, but is not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in copending application entitled Reusable Low
Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15, 1999,
herein incorporated by reference in its entirety.

Generally the substrate is planar, although as will be appreciated by those in
the art, other configurations of substrates may be used as well. For example, the probes may
be placed on the inside surface of a tube, for flow-through sample analysis to minimize
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including
closed cell foams made of particular plastics.

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized as is known in the art,
and then attached to the surface of the solid support. As will be appreciated by those skilled
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may
be via an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is
known in the art. For example, photoactivation techniques utilizing photopolymerization
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression
level of prostate cancer-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, a prostate cancer-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will
be proportional to the amount of template in the original sample. Comparison to appropriate
controls provides a measure of the amount of prostate cancer-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990).
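
The proportionality between amplification product and starting template, and the comparison to controls, can be illustrated with a short sketch. It assumes an idealized exponential amplification model and, for the comparison, the widely used comparative threshold-cycle (2^-ΔΔCt) calculation, which is not described in this specification; the function and parameter names are hypothetical.

```python
def amplicon_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized product after `cycles` rounds of PCR.

    efficiency = 1.0 means perfect doubling each cycle, so the product
    stays proportional to the initial template amount.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target in a sample versus a control, normalized
    to a reference transcript (comparative Ct method; assumes roughly
    100% amplification efficiency for both amplicons)."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_sample - delta_control))


if __name__ == "__main__":
    print(amplicon_copies(10, 3))
    print(relative_expression(22.0, 18.0, 25.0, 18.0))
```

In the usage above, a target that crosses threshold three cycles earlier in the sample than in the control, after normalization, corresponds to an eight-fold higher level under these assumptions.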

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see Wu & Wallace, Genomics 4:560 (1989), Landegren et al., Science
241:1077 (1988), and Barringer et al., Gene 89:117 (1990)), transcription amplification
(Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)), dot PCR, and linker adapter PCR,
etc.

Expression of prostate cancer proteins from nucleic acids 

In a preferred embodiment, prostate cancer nucleic acids, e.g., those encoding
prostate cancer proteins, are used to make a variety of expression vectors to express prostate
cancer proteins which can then be used in screening assays, as described below. Expression
vectors and recombinant DNA technology are well known to those of skill in the art (see,
e.g., Ausubel, supra, and Gene Expression Systems (Fernandez & Hoeffler, eds, 1999)) and
are used to express proteins. The expression vectors may be either self-replicating
extrachromosomal vectors or vectors which integrate into a host genome. Generally, these
expression vectors include transcriptional and translational regulatory nucleic acid operably
linked to the nucleic acid encoding the prostate cancer protein. The term "control sequences"
refers to DNA sequences used for the expression of an operably linked coding sequence in a
particular host organism. Control sequences that are suitable for prokaryotes, e.g., include a
promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are
known to utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is typically accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are
used in accordance with conventional practice. Transcriptional and translational regulatory
nucleic acid will generally be appropriate to the host cell used to express the prostate cancer
protein. Numerous types of appropriate expression vectors, and suitable regulatory
sequences are known in the art for a variety of host cells.

In general, transcriptional and translational regulatory sequences may include,
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and
stop sequences, translational start and stop sequences, and enhancer or activator sequences.
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional
start and stop sequences.

Promoter sequences encode either constitutive or inducible promoters. The
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art,
and are useful in the present invention.

In addition, an expression vector may comprise additional elements. For
example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, e.g., in mammalian or insect cells for expression and in a
prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a
selectable marker gene to allow the selection of transformed host cells. Selection genes are
well known in the art and will vary with the host cell used.

The prostate cancer proteins of the present invention are produced by culturing
a host cell transformed with an expression vector containing nucleic acid encoding a prostate
cancer protein, under the appropriate conditions to induce or cause expression of the prostate
cancer protein. Conditions appropriate for prostate cancer protein expression will vary with
the choice of the expression vector and the host cell, and will be easily ascertained by one
skilled in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect
and animal cells, including mammalian cells. Of particular interest are Saccharomyces
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells,
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines.

In a preferred embodiment, the prostate cancer proteins are expressed in
mammalian cells. Mammalian expression systems are also known in the art, and include
retroviral and adenoviral systems. One expression vector system is a retroviral vector system
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription
termination and polyadenylation sequences recognized by mammalian cells are regulatory
regions located 3' to the translation stop codon and thus, together with the promoter elements,
flank the coding sequence. Examples of transcription terminator and polyadenylation signals
include those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei.

In a preferred embodiment, prostate cancer proteins are expressed in bacterial
systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of the trp and lac
promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the prostate cancer protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptomyces lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial
expression vectors are transformed into bacterial host cells using techniques well known in
the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, prostate cancer proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based
expression vectors, are well known in the art.

In a preferred embodiment, prostate cancer protein is produced in yeast cells.
Yeast expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The prostate cancer protein may also be made as a fusion protein, using
techniques well known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the
desired epitope is small, the prostate cancer protein may be fused to a carrier protein to form
an immunogen. Alternatively, the prostate cancer protein may be made as a fusion protein to
increase expression, or for other reasons. For example, when the prostate cancer protein is a
prostate cancer peptide, the nucleic acid encoding the peptide may be linked to other nucleic
acid for expression purposes.

In a preferred embodiment, the prostate cancer protein is purified or isolated
after expression. Prostate cancer proteins may be isolated or purified in a variety of ways
known to those skilled in the art depending on what other components are present in the
sample. Standard purification methods include electrophoretic, molecular, immunological
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the prostate cancer
protein may be purified using a standard anti-prostate cancer protein antibody column.
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are also
useful. For general guidance in suitable purification techniques, see Scopes, Protein
Purification (1982). The degree of purification necessary will vary depending on the use of
the prostate cancer protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the prostate cancer proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc. 

Variants of prostate cancer proteins

In one embodiment, the prostate cancer proteins are derivative or variant
prostate cancer proteins as compared to the wild-type sequence. That is, as outlined more
fully below, the derivative prostate cancer peptide will often contain at least one amino acid
substitution, deletion or insertion, with amino acid substitutions being particularly preferred.
The amino acid substitution, insertion or deletion may occur at any residue within the
prostate cancer peptide.

Also included within one embodiment of prostate cancer proteins of the
present invention are amino acid sequence variants. These variants typically fall into one or
more of three classes: substitutional, insertional or deletional variants. These variants
ordinarily are prepared by site specific mutagenesis of nucleotides in the DNA encoding the
prostate cancer protein, using cassette or PCR mutagenesis or other techniques well known in
the art, to produce DNA encoding the variant, and thereafter expressing the DNA in
recombinant cell culture as outlined above. However, variant prostate cancer protein
fragments having up to about 100-150 residues may be prepared by in vitro synthesis using
established techniques. Amino acid sequence variants are characterized by the predetermined
nature of the variation, a feature that sets them apart from naturally occurring allelic or
interspecies variation of the prostate cancer protein amino acid sequence. The variants
typically exhibit the same qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed prostate cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution
mutations at predetermined sites in DNA having a known sequence are well known, e.g.,
M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using
assays of prostate cancer protein activities.
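
As an illustrative sketch of varying a predetermined site and then screening, the following enumerates every single-residue substitution at one position of a protein sequence and selects the variant that scores highest in a supplied assay. The enumeration stands in for random mutagenesis at the target codon, and the assay callable is a hypothetical placeholder for an actual activity assay; neither is a method taken from this specification.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def substitution_variants(protein, position):
    """All single-residue substitutions at a 0-based `position`.

    Returns a dict mapping a label such as 'K2A' (wild-type residue,
    1-based position, new residue) to the variant sequence.
    """
    wild_type = protein[position]
    return {
        f"{wild_type}{position + 1}{aa}": protein[:position] + aa + protein[position + 1:]
        for aa in AMINO_ACIDS
        if aa != wild_type
    }


def best_variant(variants, assay):
    """Label of the variant with the highest assay score.

    `assay` is any callable taking a sequence and returning a number;
    here it is a stand-in for a laboratory activity measurement.
    """
    return max(variants, key=lambda label: assay(variants[label]))


if __name__ == "__main__":
    variants = substitution_variants("MKTL", 1)
    # Toy assay: count leucines (purely for demonstration).
    print(best_variant(variants, lambda seq: seq.count("L")))
```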

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger.

Substitutions, deletions, insertions or any combination thereof may be used to
arrive at a final derivative. Generally these changes are done on a few amino acids to
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the prostate cancer protein are
desired, substitutions are generally made in accordance with the amino acid substitution
relationships provided in the definition section.

The variants typically exhibit the same qualitative biological activity and will
elicit the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the prostate cancer proteins as needed. Alternatively,
the variant may be designed such that the biological activity of the prostate cancer protein is
altered. For example, glycosylation sites may be altered or removed.

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those described above. For example,
substitutions may be made which more significantly affect: the structure of the polypeptide
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure;
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g. lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g. glycine.
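
The four substitution classes (a) through (d) can be illustrated with a short sketch. It is an illustration only: the residue sets contain just the example residues named in the text, and the names `HYDROPHILIC`, `nonconservative_classes`, and so on are hypothetical helpers introduced here, not part of the disclosure.

```python
# Illustrative residue groupings built only from the examples named in the text.
# All names below are hypothetical and chosen for this sketch.
HYDROPHILIC = {"S", "T"}                 # seryl, threonyl
HYDROPHOBIC = {"L", "I", "F", "V", "A"}  # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
ELECTROPOSITIVE = {"K", "R", "H"}        # lysyl, arginyl, histidyl
ELECTRONEGATIVE = {"E", "D"}             # glutamyl, aspartyl
BULKY = {"F"}                            # phenylalanine
NO_SIDE_CHAIN = {"G"}                    # glycine


def _swapped(a, b, first, second):
    """True if one residue is in `first` and the other in `second`, in either order."""
    return (a in first and b in second) or (a in second and b in first)


def nonconservative_classes(original, replacement):
    """Return the classes (a)-(d) that a one-letter substitution falls into."""
    classes = []
    if original == replacement:
        return classes  # no substitution was made
    if _swapped(original, replacement, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if "C" in (original, replacement) or "P" in (original, replacement):
        classes.append("b")  # cysteine or proline substituted for (or by) any residue
    if _swapped(original, replacement, ELECTROPOSITIVE, ELECTRONEGATIVE):
        classes.append("c")
    if _swapped(original, replacement, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes
```

An empty list means only that the substitution matches none of the listed examples; it does not by itself establish that the change is conservative.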

Covalent modifications of prostate cancer polypeptides are included within the 
scope of this invention. One type of covalent modification includes reacting targeted amino 
acid residues of a prostate cancer polypeptide with an organic derivatizing agent that is 
capable of reacting with selected side chains or the N- or C-terminal residues of a prostate
cancer polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking prostate cancer polypeptides to a water-insoluble support matrix or surface for
use in the method for purifying anti-prostate cancer polypeptide antibodies or screening
assays, as is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as bis-N-
maleimido-1,8-octane and agents such as methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the amino groups of the lysine, arginine, and histidine side chains (Creighton,
Proteins: Structure and Molecular Properties, pp. 79-86 (1983)), acetylation of the N-terminal
amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the prostate cancer polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes
herein to mean deleting one or more carbohydrate moieties found in native sequence prostate
cancer polypeptide, and/or adding one or more glycosylation sites that are not present in the 
native sequence prostate cancer polypeptide. Glycosylation patterns can be altered in many 
ways. For example, the use of different cell types to express prostate cancer-associated
sequences can result in different glycosylation patterns.

Addition of glycosylation sites to prostate cancer polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, e.g., 
by the addition of, or substitution by, one or more serine or threonine residues to the native 
sequence prostate cancer polypeptide (for O-linked glycosylation sites). The prostate cancer
amino acid sequence may optionally be altered through changes at the DNA level,
particularly by mutating the DNA encoding the prostate cancer polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids. 

Another means of increasing the number of carbohydrate moieties on the
prostate cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330, and in Aplin &
Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the prostate cancer polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol. 138:350 (1987).

Another type of covalent modification of prostate cancer polypeptides comprises linking the
prostate cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337. 

Prostate cancer polypeptides of the present invention may also be modified in 
a way to form chimeric molecules comprising a prostate cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a prostate cancer polypeptide with a tag polypeptide which 
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is 
generally placed at the amino- or carboxyl-terminus of the prostate cancer polypeptide. The
presence of such epitope-tagged forms of a prostate cancer polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
prostate cancer polypeptide to be readily purified by affinity purification using an anti-tag 
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative 
embodiment, the chimeric molecule may comprise a fusion of a prostate cancer polypeptide 
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field et
al., Mol. Cell. Biol. 8:2159-2165 (1988)); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto (Evan et al., Molecular and Cellular Biology 5:3610-3616 (1985));
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky et al.,
Protein Engineering 3(6):547-553 (1990)). Other tag polypeptides include the Flag-peptide
(Hopp et al., BioTechnology 6:1204-1210 (1988)); the KT3 epitope peptide (Martin et al.,
Science 255:192-194 (1992)); tubulin epitope peptide (Skinner et al., J. Biol. Chem.
266:15163-15166 (1991)); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA 87:6393-6397 (1990)).

Also included are other prostate cancer proteins of the prostate cancer family, 
and prostate cancer proteins from other organisms, which are cloned and expressed as 
outlined below. Thus, probe or degenerate polymerase chain reaction (PCR) primer 
sequences may be used to find other related prostate cancer proteins from humans or other
organisms. As will be appreciated by those in the art, particularly useful probe and/or PCR
primer sequences include the unique areas of the prostate cancer nucleic acid sequence. As is
generally known in the art, preferred PCR primers are from about 15 to about 35 nucleotides
in length, with from about 20 to about 30 being preferred, and may contain inosine as needed.
The conditions for the PCR reaction are well known in the art (e.g., Innis, PCR Protocols,
supra).
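
The primer length ranges stated above can be checked with a small helper. This is a minimal sketch: the function name `primer_length_class`, the use of the letter "I" for inosine, and the treatment of the "about" bounds as hard cutoffs are all assumptions made for illustration.

```python
# Length bounds quoted in the text; treated here as hard cutoffs for simplicity,
# although the text says "about". All names are hypothetical.
ACCEPTABLE_RANGE = (15, 35)
PREFERRED_RANGE = (20, 30)
ALLOWED_BASES = set("ACGTI")  # "I" denotes inosine (an assumption of this sketch)


def primer_length_class(primer):
    """Classify a primer as 'preferred', 'acceptable', or 'outside range' by length."""
    seq = primer.upper()
    unknown = set(seq) - ALLOWED_BASES
    if unknown:
        raise ValueError(f"unexpected symbols in primer: {sorted(unknown)}")
    n = len(seq)
    if PREFERRED_RANGE[0] <= n <= PREFERRED_RANGE[1]:
        return "preferred"
    if ACCEPTABLE_RANGE[0] <= n <= ACCEPTABLE_RANGE[1]:
        return "acceptable"
    return "outside range"
```

Length is only one design criterion; uniqueness of the target region, as noted above, would have to be assessed separately.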

Antibodies to prostate cancer proteins 

In a preferred embodiment, when the prostate cancer protein is to be used to 
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the prostate cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller prostate cancer protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are known to the skilled artisan 
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a 
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant. 
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not 
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean 
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation. 

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal 
antibodies may be prepared using hybridoma methods, such as those described by Kohler &
Milstein, Nature 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal, is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-16, a fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene
glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice,
pp. 59-103 (1986)). Immortalized cell lines are usually transformed mammalian cells,
particularly myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell lines are employed. The hybridoma cells may be cultured in a suitable culture
medium that preferably contains one or more substances that inhibit the growth or survival of
the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific 
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-16 or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents. 

In a preferred embodiment, the antibodies to prostate cancer protein are
capable of reducing or eliminating a biological function of a prostate cancer protein, as is
described below. That is, the addition of anti-prostate cancer protein antibodies (either
polyclonal or preferably monoclonal) to prostate cancer tissue (or cells containing prostate
cancer) may reduce or eliminate the prostate cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.
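
As a worked illustration of these thresholds, the sketch below computes a percent decrease from a baseline measurement and maps it to the preference tiers named above. The function names and the idea of a single scalar readout (activity, growth, or size) are assumptions for illustration; the text does not prescribe a particular measurement.

```python
def percent_decrease(baseline, treated):
    """Percent decrease of `treated` relative to `baseline` (positive means reduction)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - treated) / baseline


def preference_tier(decrease_pct):
    """Map a percent decrease to the tiers described in the text (hypothetical helper)."""
    if decrease_pct >= 95:
        return "especially preferred"    # about a 95-100% decrease
    if decrease_pct >= 50:
        return "particularly preferred"  # at least about 50%
    if decrease_pct >= 25:
        return "preferred"               # at least a 25% decrease
    return "below preferred threshold"
```

For example, a readout that falls from 200 units to 100 units is a 50% decrease and lands in the "particularly preferred" tier under these cutoffs.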

In a preferred embodiment, the antibodies to the prostate cancer proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, a humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-329 (1988); and Presta, Curr. Op. Struct. Biol. 2:593-596 (1992)). Humanization
can be essentially performed following the method of Winter and co-workers (Jones et al.,
Nature 321:522-525 (1986); Riechmann et al., Nature 332:323-327 (1988); Verhoeyen et al.,
Science 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species.

Human antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom & Winter, J. Mol. Biol. 227:381 (1991);
Marks et al., J. Mol. Biol. 222:581 (1991)). The techniques of Cole et al. and Boerner et al.
are also available for the preparation of human monoclonal antibodies (Cole et al.,
Monoclonal Antibodies and Cancer Therapy, p. 77 (1985) and Boerner et al., J. Immunol.
147(1):86-95 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in all
respects, including gene rearrangement, assembly, and antibody repertoire. This approach is
described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425;
5,661,016, and in the following scientific publications: Marks et al., Bio/Technology
10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison, Nature 368:812-13
(1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996); Neuberger, Nature
Biotechnology 14:826 (1996); Lonberg & Huszar, Intern. Rev. Immunol. 13:65-93 (1995).

By immunotherapy is meant treatment of prostate cancer with an antibody
raised against prostate cancer proteins. As used herein, immunotherapy can be passive or
active. Passive immunotherapy as defined herein is the passive transfer of antibody to a
recipient (patient). Active immunization is the induction of antibody and/or T-cell responses
in a recipient (patient). Induction of an immune response is the result of providing the
recipient with an antigen to which antibodies are raised. As appreciated by one of ordinary
skill in the art, the antigen may be provided by injecting a polypeptide against which
antibodies are desired to be raised into a recipient, or contacting the recipient with a nucleic
acid capable of expressing the antigen and under conditions for expression of the antigen,
leading to an immune response.

In a preferred embodiment, the prostate cancer proteins against which
antibodies are raised are secreted proteins as described above. Without being bound by
theory, antibodies used for treatment bind and prevent the secreted protein from binding to
its receptor, thereby inactivating the secreted prostate cancer protein.

In another preferred embodiment, the prostate cancer protein to which
antibodies are raised is a transmembrane protein. Without being bound by theory, antibodies
used for treatment bind the extracellular domain of the prostate cancer protein and prevent it
from binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane prostate cancer protein. As will
be appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
prostate cancer protein. The antibody is also an antagonist of the prostate cancer protein.
Further, the antibody prevents activation of the transmembrane prostate cancer protein. In
one aspect, when the antibody prevents the binding of other molecules to the prostate cancer
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein, thereby
mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, prostate
cancer is treated by administering to a patient antibodies directed against the transmembrane
prostate cancer protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector 
moiety. The effector moiety can be any number of molecules, including labelling moieties 
such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect 
the therapeutic moiety is a small molecule that modulates the activity of the prostate cancer
protein. In another aspect the therapeutic moiety modulates the activity of molecules
associated with or in close proximity to the prostate cancer protein. The therapeutic moiety
may inhibit enzymatic activity such as protease or collagenase or protein kinase activity
associated with prostate cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic
agent. In this method, targeting the cytotoxic agent to prostate cancer tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
prostate cancer. Cytotoxic agents are numerous and varied and include, but are not limited
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against prostate
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently
attached to the antibody. Targeting the therapeutic moiety to transmembrane prostate cancer
proteins not only serves to increase the local concentration of therapeutic moiety in the
prostate cancer afflicted area, but also serves to reduce deleterious side effects that may be
associated with the therapeutic moiety.

In another preferred embodiment, the prostate cancer protein against which the 
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated 
to a protein which facilitates entry into the cell. In one case, the antibody enters the cell by 
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, wherein the prostate cancer protein can be targeted within a
cell, i.e., the nucleus, an antibody thereto contains a signal for that target localization, i.e., a 
nuclear localization signal. 

The prostate cancer antibodies of the invention specifically bind to prostate
cancer proteins. By "specifically bind" herein is meant that the antibodies bind to the protein
with a Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about
0.1 µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important.

Detection of prostate cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the prostate cancer phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing prostate cancer) and in prostate cancer tissue (and in some cases, for
varying severities of prostate cancer that relate to prognosis, as outlined below) are evaluated
to provide expression profiles. An expression profile of a particular cell state or point of
development is essentially a "fingerprint" of the state. While two states may have any
particular gene similarly expressed, the evaluation of a number of genes simultaneously
allows the generation of a gene expression profile that is reflective of the state of the cell. By
comparing expression profiles of cells in different states, information regarding which genes
are important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or cancerous tissue. This will provide for
molecular diagnosis of related conditions.
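
One way to picture the comparison of a sample "fingerprint" against reference profiles is a correlation-based match. The sketch below is an assumption-laden illustration, not the method of the disclosure: it uses Pearson correlation as the similarity measure, and the function names and the dictionary layout (gene name mapped to expression level) are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def closest_state(sample, references):
    """Return (best_state, scores) for a sample profile.

    `sample` maps gene name -> expression level; `references` maps a state
    label (e.g. "normal", "cancer") -> a profile over the same genes.
    """
    genes = sorted(sample)
    sample_values = [sample[g] for g in genes]
    scores = {
        state: pearson(sample_values, [profile[g] for g in genes])
        for state, profile in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores
```

In practice the choice of similarity measure, normalization, and gene set would all need to be validated; this sketch only shows why evaluating many genes at once gives a more informative comparison than any single gene.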

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression 
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus prostate cancer tissue. Genes may be turned on or turned off in a particular
state, relative to another state thus permitting comparison of two or more states. A
qualitatively regulated gene will exhibit an expression pattern within a state or cell type
which is detectable by standard techniques. Some genes will be expressed in one state or cell
type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in
that expression is increased or decreased; i.e., gene expression is either upregulated, resulting
in an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is at least
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.

Evaluation may be at the gene transcript, or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the
final gene product itself (protein) can be monitored, e.g., with antibodies to the prostate
cancer protein and standard immunoassays (ELISAs, etc.) or other techniques, including
mass spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to
prostate cancer genes, i.e., those identified as being important in a prostate cancer phenotype,
can be evaluated in a prostate cancer diagnostic test.
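
The percent-change thresholds for differential expression stated above (50%, 100%, 150%, 200%, and 300% or more) can be applied to a pair of measured expression levels, whether transcript or protein. The sketch below is illustrative only: it assumes the change is computed relative to the reference (e.g., normal tissue) level, and the names `percent_change` and `highest_threshold_met` are hypothetical.

```python
# Thresholds (in percent) taken from the preference tiers in the text.
THRESHOLDS = (50, 100, 150, 200, 300)


def percent_change(reference, sample):
    """Signed percent change of `sample` relative to `reference`."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return 100.0 * (sample - reference) / reference


def highest_threshold_met(reference, sample):
    """Return (direction, threshold): the largest listed threshold the change reaches.

    `direction` is "up", "down", or "none"; `threshold` is None when the
    change is smaller than the lowest listed threshold.
    """
    change = percent_change(reference, sample)
    direction = "up" if change > 0 else "down" if change < 0 else "none"
    met = [t for t in THRESHOLDS if abs(change) >= t]
    return direction, (max(met) if met else None)
```

Note that with this convention a decrease can never exceed 100% of the reference level, so a different convention (for example, change relative to the lower of the two levels) may be more appropriate when scoring downregulation; that choice is left open here.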

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the prostate cancer nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of prostate cancer sequences
in a particular cell. The assays are further described below in the example. PCR techniques
can be used to provide greater sensitivity.

In a preferred embodiment, nucleic acids encoding the prostate cancer protein
are detected. Although DNA or RNA encoding the prostate cancer protein may be detected,
of particular interest are methods wherein an mRNA encoding a prostate cancer protein is
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
prostate cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl
phosphate.

In a preferred embodiment, various proteins from the three classes of proteins
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The prostate cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing prostate cancer sequences are used in diagnostic assays. This can be performed on
an individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, prostate cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of prostate cancer. 
Detection of these proteins in putative prostate cancer tissue allows for detection or diagnosis 
of prostate cancer. In one embodiment, antibodies are used to detect prostate cancer proteins.
A preferred method separates proteins from a sample by electrophoresis on a gel (typically a
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the prostate cancer protein is 
detected, e.g., by immunoblotting with antibodies raised against the prostate cancer protein. 
Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the prostate cancer protein find use
in in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one
to many antibodies to the prostate cancer protein(s). Following washing to remove
non-specific antibody binding, the presence of the antibody or antibodies is detected. In one
embodiment the antibody is detected by incubating with a secondary antibody that contains a
detectable label. In another method the primary antibody to the prostate cancer protein(s)
contains a detectable label, e.g. an enzyme marker that can act on a substrate. In another
preferred embodiment each one of multiple primary antibodies contains a distinct and
detectable label. This method finds particular use in simultaneous screening for a plurality of
prostate cancer proteins. As will be appreciated by one of ordinary skill in the art, many
other histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a
fluorescence activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing prostate
cancer from blood, serum, plasma, stool, and other samples. Such samples, therefore, are
useful as samples to be probed or tested for the presence of prostate cancer proteins.
Antibodies can be used to detect a prostate cancer protein by previously described
immunoassay techniques including ELISA, immunoblotting (western blotting),
immunoprecipitation, BIACORE technology and the like. Conversely, the presence of
antibodies may indicate an immune response against an endogenous prostate cancer protein.

In a preferred embodiment, in situ hybridization of labeled prostate cancer 
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including 
prostate cancer tissue and/or normal tissue, are made. In situ hybridization (see, e.g., 
Ausubel, supra) is then performed. When comparing the fingerprints between an individual
and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on
the findings. It is further understood that the genes which indicate the diagnosis may differ 
from those which indicate the prognosis and molecular profiling of the condition of the cells 
may lead to distinctions between responsive or refractory conditions or may be predictive of 
outcomes. 

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic
acids, modified proteins and cells containing prostate cancer sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to prostate cancer,
in terms of long term prognosis. Again, this may be done on either a protein or gene level,
with the use of genes being preferred. As above, prostate cancer probes may be attached to
biochips for the detection and quantification of prostate cancer sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 
In a preferred embodiment, members of the proteins, nucleic acids, and
antibodies as described herein are used in drug screening assays. The prostate cancer
proteins, antibodies, nucleic acids, modified proteins and cells containing prostate cancer
sequences are used in drug screening assays or by evaluating the effect of drug candidates on
a "gene expression profile" or expression profile of polypeptides. In a preferred embodiment,
the expression profiles are used, preferably in conjunction with high throughput screening
techniques, to allow monitoring for expression profile genes after treatment with a candidate
agent (e.g., Zlokarnik et al., Science 279:84-8 (1998); Heid, Genome Res. 6:986-94 (1996)).

In a preferred embodiment, the prostate cancer proteins, antibodies, nucleic
acids, modified proteins and cells containing the native or modified prostate cancer proteins
are used in screening assays. That is, the present invention provides novel methods for
screening for compositions which modulate the prostate cancer phenotype or an identified
physiological function of a prostate cancer protein. As above, this can be done on an
individual gene level or by evaluating the effect of drug candidates on a "gene expression
profile". In a preferred embodiment, the expression profiles are used, preferably in
conjunction with high throughput screening techniques, to allow monitoring for expression
profile genes after treatment with a candidate agent; see Zlokarnik, supra.

Having identified the differentially expressed genes herein, a variety of assays
may be executed. In a preferred embodiment, assays may be run on an individual gene or
protein level. That is, having identified a particular gene as up regulated in prostate cancer,
test compounds can be screened for the ability to modulate gene expression or for binding to
the prostate cancer protein. "Modulation" thus includes both an increase and a decrease in
gene expression. The preferred amount of modulation will depend on the original change of
the gene expression in normal versus tissue undergoing prostate cancer, with changes of at
least 10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000%
or greater. Thus, if a gene exhibits a 4-fold increase in prostate cancer tissue
compared to normal tissue, a decrease of about four-fold is often desired; similarly, a 10-fold
decrease in prostate cancer tissue compared to normal tissue often provides a target value of a
10-fold increase in expression to be induced by the test compound.
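
By way of illustration only, the relationship between the observed change in expression and the
change a test compound would need to induce can be sketched as follows. The function name
`target_modulation` and its return format are hypothetical and are not part of the described
methods; the sketch assumes expression is summarized as a single cancer-versus-normal fold value.

```python
def target_modulation(cancer_vs_normal_fold):
    """Return the change a test compound would need to induce to restore
    normal expression (hypothetical helper, for illustration only).

    cancer_vs_normal_fold > 1: gene is up regulated in prostate cancer tissue.
    cancer_vs_normal_fold < 1: gene is down regulated in prostate cancer tissue.
    """
    if cancer_vs_normal_fold <= 0:
        raise ValueError("fold change must be positive")
    required = 1.0 / cancer_vs_normal_fold  # multiplicative correction needed
    if required < 1:
        direction = "decrease"
    elif required > 1:
        direction = "increase"
    else:
        direction = "none"
    # Report the magnitude as a fold value of at least 1 in either direction.
    magnitude = max(required, cancer_vs_normal_fold)
    return {"direction": direction, "fold": magnitude}


# A gene 4-fold up in cancer tissue calls for about a 4-fold decrease.
up_regulated = target_modulation(4.0)
# A gene 10-fold down in cancer tissue (0.1x) calls for about a 10-fold increase.
down_regulated = target_modulation(0.1)
```

The two calls mirror the 4-fold and 10-fold examples given in the text.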

The amount of gene expression may be monitored using nucleic acid probes
and the quantification of gene expression levels, or, alternatively, the gene product itself can
be monitored, e.g., through the use of antibodies to the prostate cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, the gene expression or protein levels of a number
of entities, i.e., an expression profile, are monitored simultaneously. Such profiles will
typically involve a plurality of those entities described herein.

In this embodiment, the prostate cancer nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of prostate cancer sequences
in a particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates,
may be used with primers dispensed in desired wells. A PCR reaction can then be performed
and analyzed for each well.
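
As a purely illustrative sketch of dispensing primers into desired wells, primer pairs can be
mapped onto the well coordinates of a microtiter plate. The 8 x 12 (96-well) geometry, the
function name `plate_layout`, and the primer labels are assumptions made for this example only.

```python
from string import ascii_uppercase


def plate_layout(primer_pairs, rows=8, cols=12):
    """Assign each primer pair to one well, filling row by row (A1, A2, ...).

    Hypothetical helper; assumes a standard 96-well plate unless told otherwise.
    """
    if len(primer_pairs) > rows * cols:
        raise ValueError("more primer pairs than wells on the plate")
    layout = {}
    for index, pair in enumerate(primer_pairs):
        row, col = divmod(index, cols)
        well = f"{ascii_uppercase[row]}{col + 1}"
        layout[well] = pair
    return layout


# Thirteen hypothetical primer pairs: the first row fills, the 13th lands in B1.
pairs = [(f"gene{i}_fwd", f"gene{i}_rev") for i in range(1, 14)]
layout = plate_layout(pairs)
```

Each well's PCR result can then be recorded against the same well key.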

Expression monitoring can be performed to identify compounds that modify
the expression of one or more prostate cancer-associated sequences, e.g., a polynucleotide
sequence set out in Tables 1-16. Generally, in a preferred embodiment, a test modulator is
added to the cells prior to analysis. Moreover, screens are also provided to identify agents
that modulate prostate cancer, modulate prostate cancer proteins, bind to a prostate cancer
protein, or interfere with the binding of a prostate cancer protein and an antibody or other
binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the prostate cancer phenotype or the expression of a prostate cancer sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses a prostate cancer phenotype, e.g., to a normal tissue
fingerprint. In another embodiment, a modulator induces a prostate cancer phenotype.
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations
to obtain a differential response to the various concentrations. Typically, one of these
concentrations serves as a negative control, i.e., at zero concentration or below the level of
detection.
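
The use of a zero-concentration mixture as a negative control can be illustrated with a minimal
sketch that expresses each reading relative to that control. The function name
`normalize_to_control`, the concentration units, and the readings are hypothetical values chosen
for the example, not data from this disclosure.

```python
def normalize_to_control(readings):
    """Express each assay reading as a fraction of the negative control.

    `readings` maps agent concentration to a raw signal; the entry at
    concentration 0.0 is treated as the negative control (an assumption
    of this sketch).
    """
    if 0.0 not in readings:
        raise ValueError("a zero-concentration negative control is required")
    control = readings[0.0]
    if control == 0:
        raise ValueError("control reading must be non-zero")
    return {conc: value / control for conc, value in sorted(readings.items())}


# Hypothetical signals for assay mixtures run in parallel at four concentrations.
readings = {0.0: 200.0, 0.1: 180.0, 1.0: 100.0, 10.0: 50.0}
relative = normalize_to_control(readings)
```

A falling series of relative values across concentrations is one simple way to read the
differential response described above.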

Drug candidates encompass numerous chemical classes, though typically they
are organic molecules, preferably small organic compounds having a molecular weight of
more than 100 and less than about 2,500 daltons. Preferred small molecules are less than
2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents comprise
functional groups necessary for structural interaction with proteins, particularly hydrogen
bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group,
preferably at least two of the functional chemical groups. The candidate agents often
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic
structures substituted with one or more of the above functional groups. Candidate agents are
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines,
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred
are peptides.

In one aspect, a modulator will neutralize the effect of a prostate cancer 
protein. By "neutralize" is meant that the activity of a protein, and the
consequent effect on the cell, is inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a prostate cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve 
providing a library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical 
compounds generated by either chemical synthesis or biological synthesis by combining a 
number of chemical "building blocks" such as reagents. For example, a linear combinatorial 
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of
chemical building blocks called amino acids in every possible way for a given compound
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al., J. Med. Chem. 37(9):1233-1251 (1994)).
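
The growth in library size with compound length can be illustrated with a short sketch. It
assumes the 20 standard amino acids as the building blocks; the names `AMINO_ACIDS`,
`library_size`, and `enumerate_library` are hypothetical and introduced only for this example.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed building-block set).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct linear sequences of the given length."""
    return len(alphabet) ** length


def enumerate_library(length, alphabet=AMINO_ACIDS):
    """Lazily yield every sequence of the given length, in alphabet order."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


dipeptides = list(enumerate_library(2))   # every possible dipeptide
pentapeptide_count = library_size(5)      # count only; enumeration would be large
```

Under this assumption, a length of only five residues already gives 20^5 = 3,200,000
sequences, consistent with the "millions" of compounds noted above.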

Preparation and screening of combinatorial chemical libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka, Pept. Prot. Res. 37:487-493
(1991), Houghton et al., Nature, 354:84-88 (1991)), peptoids (PCT Publication No WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs et al., Proc. Nat. Acad. Sci. USA
90:6909-6913 (1993)), vinylogous polypeptides (Hagihara et al., J. Amer. Chem. Soc.
114:6568 (1992)), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding
(Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)), analogous organic syntheses
of small compound libraries (Chen et al., J. Amer. Chem. Soc. 116:2661 (1994)),
oligocarbamates (Cho et al., Science 261:1303 (1993)), and/or peptidyl phosphonates
(Campbell et al., J. Org. Chem. 59:658 (1994)). See, generally, Gordon et al., J. Med. Chem.
37:1385 (1994), nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid
libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature
Biotechnology 14(3):309-314 (1996), and PCT/US96/10287), carbohydrate libraries (see,
e.g., Liang et al., Science 274:1520-1522 (1996), and U.S. Patent No. 5,593,853), and small
organic molecule libraries (see, e.g., benzodiazepines, Baum, C&EN, Jan 18, page 33 (1993);
isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No.
5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds,
U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony,
Rainin, Woburn, MA, 433 A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore,
Bedford, MA).

A number of well known robotic systems have also been developed for
solution phase chemistries. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual
synthetic operations performed by a chemist. Any of the above devices are suitable for use
with the present invention. The nature and implementation of modifications to these devices
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the
relevant art. In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening.
Preferred assays thus detect enhancement or inhibition of prostate cancer gene transcription,
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Similarly, binding assays and reporter gene assays are well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc. Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides
technical bulletins describing screening systems for detecting the modulation of gene
transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In
this way libraries of proteins may be made for screening in the methods of the invention.
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and
mammalian proteins, with the latter being preferred, and human proteins being especially
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7
to about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may incorporate any nucleotide or amino acid at any position. The synthetic process can be
designed to generate randomized proteins or nucleic acids, to allow the formation of all or
most of the possible combinations over the length of the sequence, thus forming a library of
randomized candidate bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, e.g., of hydrophobic amino acids,
hydrophilic residues, sterically biased (either small or large) residues, towards the creation of
nucleic acid binding domains, the creation of cysteines for cross-linking, prolines for SH-3
domains, serines, threonines, tyrosines or histidines for phosphorylation sites, etc., or to
purines, etc.
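
A biased library of this kind can be sketched, for illustration only, as a template in which each
position lists the residues permitted there: a single residue holds the position constant, a
defined class restricts it, and the full alphabet leaves it fully random. The template, the
grouping labeled `HYDROPHOBIC`, and the function name `biased_peptide` are assumptions of this
example rather than a prescribed design.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes
HYDROPHOBIC = "AVILMFWY"              # illustrative class; groupings vary by source


def biased_peptide(template, rng):
    """Draw one peptide; each template entry is the set of residues allowed
    at that position (hypothetical helper for illustration)."""
    return "".join(rng.choice(allowed) for allowed in template)


# Position 1 constant Cys, position 2 hydrophobic, position 5 restricted to
# Ser/Thr/Tyr, position 7 constant Pro; the remaining positions fully random.
template = ["C", HYDROPHOBIC, AMINO_ACIDS, AMINO_ACIDS, "STY", AMINO_ACIDS, "P"]

rng = random.Random(0)  # seeded so the sketch is reproducible
library = [biased_peptide(template, rng) for _ in range(5)]
```

Replacing every template entry with `AMINO_ACIDS` gives the fully randomized case.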

Modulators of prostate cancer can also be nucleic acids, as defined below. As
described above generally for proteins, nucleic acid modulating agents may be naturally
occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids. For
example, digests of procaryotic or eucaryotic genomes may be used as is outlined above for
proteins.

In certain embodiments, the activity of a prostate cancer-associated protein is
down-regulated, or entirely inhibited, by the use of an antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., a prostate cancer protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring
subunits or their close homologs. Antisense polynucleotides may also have altered sugar
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other
sulfur containing species which are known for use in the art. Analogs are comprehended by
this invention so long as they function effectively to hybridize with the prostate cancer
protein mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant 
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several 
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense
oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by
binding to the anti-sense strand. The antisense and sense oligonucleotides comprise a
single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA
(sense) or DNA (antisense) sequences for prostate cancer molecules. Antisense or sense
oligonucleotides, according to the present invention, comprise a fragment generally at least
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein,
is described in, e.g., Stein & Cohen (Cancer Res. 48:2659 (1988)) and van der Krol et al.
(BioTechniques 6:958 (1988)).
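
The derivation of an antisense oligonucleotide from a known sequence can be illustrated with a
minimal sketch that takes the reverse complement of a chosen mRNA window of about 14 to 30
nucleotides. The function name `antisense_rna` and the short example sequence are hypothetical;
the sketch addresses sequence complementarity only and says nothing about target site selection
or chemical modification.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna, start, length=20):
    """Return the antisense RNA oligonucleotide (written 5' to 3') that is
    complementary to mrna[start:start + length].

    Hypothetical helper; the 14-30 nt bounds follow the range given in the text.
    """
    if not 14 <= length <= 30:
        raise ValueError("length should be from about 14 to 30 nucleotides")
    target = mrna[start:start + length].upper()
    if len(target) < length:
        raise ValueError("window extends past the end of the sequence")
    if set(target) - set("ACGU"):
        raise ValueError("sequence must contain only A, C, G and U")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]


# Made-up 27-nt sequence, used only to exercise the function.
example_mrna = "AUGGCCAAGCUUGGAUCCGAAUUCGGC"
oligo = antisense_rna(example_mrna, start=0, length=14)
```

For a DNA oligonucleotide the same approach applies with T in place of U.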

In addition to antisense polynucleotides, ribozymes can be used to target and
inhibit transcription of prostate cancer-associated nucleotide sequences. A ribozyme is an
RNA molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes
have been described, including group I ribozymes, hammerhead ribozymes, hairpin
ribozymes, RNase P, and axhead ribozymes (see, e.g., Castanotto et al., Adv. in
Pharmacology 25:289-317 (1994) for a general review of the properties of different
ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel et al.,
Nucl. Acids Res. 18:299-304 (1990); European Patent Publication No. 0 360 257; U.S. Patent
No. 5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g.,
WO 94/26877; Ojwang et al., Proc. Natl. Acad. Sci. USA 90:6340-6344 (1993); Yamada et
al., Human Gene Therapy 1:39-45 (1994); Leavitt et al., Proc. Natl. Acad. Sci. USA 92:699-
703 (1995); Leavitt et al., Human Gene Therapy 5:1151-120 (1994); and Yamada et al.,
Virology 205:121-126 (1994)).

Polynucleotide modulators of prostate cancer may be introduced into a cell
containing the target nucleotide sequence by formation of a conjugate with a ligand binding
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does
not substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of prostate
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

As noted above, gene expression monitoring is conveniently used to test
candidate modulators (e.g., protein, nucleic acid or small molecule). After the candidate agent
has been added and the cells allowed to incubate for some period of time, the sample
containing a target sequence to be analyzed is added to the biochip. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of
detecting the target sequence's specific binding to a probe. The label also can be an enzyme,
such as alkaline phosphatase or horseradish peroxidase, which when provided with an
appropriate substrate produces a product that can be detected. Alternatively, the label can be
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention,
including high, moderate and low stringency conditions as outlined above. The assays are
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain
steps at higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders,
with preferred embodiments outlined below. In addition, the reaction may include a variety
of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
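
One simple way to express changes in expression levels as between two states is a per-gene log2
ratio, sketched below for illustration. The function name `expression_profile`, the two-fold
cutoff, and the gene names and values are assumptions of this example and are not taken from
Tables 1-16.

```python
import math


def expression_profile(normal, cancer, min_abs_log2=1.0):
    """Return {gene: log2(cancer / normal)} for genes measured in both states
    whose change meets the cutoff (default: at least two-fold either way).

    Hypothetical helper; inputs map gene names to positive expression levels.
    """
    profile = {}
    for gene in normal.keys() & cancer.keys():
        if normal[gene] <= 0 or cancer[gene] <= 0:
            continue  # a ratio is undefined without positive signal in both states
        log2_fc = math.log2(cancer[gene] / normal[gene])
        if abs(log2_fc) >= min_abs_log2:
            profile[gene] = log2_fc
    return profile


# Made-up expression levels for three hypothetical genes.
normal = {"geneA": 100.0, "geneB": 50.0, "geneC": 80.0}
cancer = {"geneA": 400.0, "geneB": 55.0, "geneC": 10.0}
profile = expression_profile(normal, cancer)
```

Positive values indicate genes up regulated in the second state and negative values indicate
genes down regulated; genes below the cutoff are left out of the profile.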

Screens are performed to identify modulators of the prostate cancer
phenotype. In one embodiment, screening is performed to identify modulators that can
induce or suppress a particular expression profile, thus preferably generating the associated
phenotype. In another embodiment, e.g., for diagnostic applications, having identified
differentially expressed genes important in a particular state, screens can be performed to
identify modulators that alter expression of individual genes. In another embodiment,
screening is performed to identify modulators that alter a biological function of the
expression product of a differentially expressed gene. Again, having identified the
importance of a gene in a particular state, screens are performed to identify agents that bind
and/or modulate the biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a
candidate agent. After identifying a modulator based upon its ability to suppress a prostate
cancer expression pattern leading to a normal expression pattern, or to modulate a single
prostate cancer gene expression profile so as to mimic the expression of the gene from
normal tissue, a screen as described above can be performed to identify genes that are
specifically modulated in response to the agent. Comparing expression profiles between
normal tissue and agent treated prostate cancer tissue reveals genes that are not expressed in
normal tissue or prostate cancer tissue, but are expressed in agent treated tissue. These
agent-specific sequences can be identified and used by methods described herein for prostate cancer
genes or proteins. In particular these sequences and the proteins they encode find use in
marking or identifying agent treated cells. In addition, antibodies can be raised against the
agent induced proteins and used to target novel therapeutics to the treated prostate cancer
tissue sample.

Thus, in one embodiment, a test compound is administered to a population of
prostate cancer cells that have an associated prostate cancer expression profile. By
"administration" or "contacting" herein is meant that the candidate agent is added to the cells
in such a manner as to allow the agent to act upon the cell, whether by uptake and
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct
such as an adenoviral or retroviral construct, and added to the cell, such that expression of
the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems
can also be used.

Once the test compound has been administered to the cells, the cells can be 
washed if desired and are allowed to incubate under preferably physiological conditions for 
some period of time. The cells are then harvested and a new gene expression profile is
generated, as outlined herein.

Thus, e.g., prostate cancer tissue may be screened for agents that modulate,
e.g., induce or suppress the prostate cancer phenotype. A change in at least one gene,
preferably many, of the expression profile indicates that the agent has an effect on prostate
cancer activity. By defining such a signature for the prostate cancer phenotype, screens for
new drugs that alter the phenotype can be devised. With this approach, the drug target need
not be known and need not be represented in the original expression screening platform, nor
does the level of transcript for the target protein need to change.

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of
either the expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins"
or a "prostate cancer modulatory protein". The prostate cancer modulatory protein may be a
fragment, or alternatively, be the full length protein to the fragment encoded by the nucleic
acids of Tables 1-16. Preferably, the prostate cancer modulatory protein is a fragment. In a
preferred embodiment, the prostate cancer amino acid sequence which is used to determine
sequence identity or similarity is encoded by a nucleic acid of Tables 1-16. In another
embodiment, the sequences are naturally occurring allelic variants of a protein encoded by a
nucleic acid of Tables 1-16. In another embodiment, the sequences are sequence variants as
further described herein.

Preferably, the prostate cancer modulatory protein is a fragment of
approximately 14 to 24 amino acids long. More preferably the fragment is a soluble
fragment. Preferably, the fragment includes a non-transmembrane region. In a preferred
embodiment, the fragment has an N-terminal Cys to aid in solubility. In one embodiment, the
C-terminus of the fragment is kept as a free acid and the N-terminus is a free amine to aid in
coupling, i.e., to cysteine.

In one embodiment the prostate cancer proteins are conjugated to an 
immunogenic agent as discussed herein. In one embodiment the prostate cancer protein is 
conjugated to BSA.

Measurements of prostate cancer polypeptide activity, or of prostate cancer or
the prostate cancer phenotype, can be performed using a variety of assays. For example, the
effects of the test compounds upon the function of the prostate cancer polypeptides can be
measured by examining parameters described above. A suitable physiological change that
affects activity can be used to assess the influence of a test compound on the polypeptides of
this invention. When the functional consequences are determined using intact cells or
animals, one can also measure a variety of effects such as, in the case of prostate cancer
associated with tumors, tumor growth, tumor metastasis, neovascularization, hormone
release, transcriptional changes to both known and uncharacterized genetic markers (e.g.,
northern blots), changes in cell metabolism such as cell growth or pH changes, and changes
in intracellular second messengers such as cGMP. In the assays of the invention, mammalian
prostate cancer polypeptide is typically used, e.g., mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in
vitro. For example, a prostate cancer polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the prostate cancer polypeptide levels are determined in vitro by measuring the level of
protein or mRNA. The level of protein is measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the prostate cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is detected using directly or indirectly
labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the prostate cancer
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "prostate cancer proteins."
The prostate cancer protein may be a fragment, or alternatively, be the full length protein
corresponding to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure-activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the prostate cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a prostate
cancer protein and a candidate compound, and determining the binding of the compound to
the prostate cancer protein. Preferred embodiments utilize the human prostate cancer protein,
although other mammalian proteins may also be used, e.g. for the development of animal
models of human disease. In some embodiments, as outlined herein, variant or derivative
prostate cancer proteins may be used.

Generally, in a preferred embodiment of the methods herein, the prostate
cancer protein or the candidate agent is non-diffusably bound to an insoluble support having
isolated sample receiving areas (e.g. a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of any
convenient shape. Examples of suitable insoluble supports include microtiter plates, arrays,
membranes and beads. These are typically made of glass, plastic (e.g., polystyrene),
polysaccharides, nylon or nitrocellulose, teflon™, etc. Microtiter plates and arrays are
especially convenient because a large number of assays can be carried out simultaneously,
using small amounts of reagents and samples. The particular manner of binding of the
composition is not crucial so long as it is compatible with the reagents and overall methods
of the invention, maintains the activity of the composition and is nondiffusable. Preferred
methods of binding include the use of antibodies (which do not sterically block either the
ligand binding site or activation sequence when the protein is bound to the support), direct
binding to "sticky" or ionic supports, chemical crosslinking, the synthesis of the protein or
agent on the surface, etc. Following binding of the protein or agent, excess unbound material
is removed by washing. The sample receiving areas may then be blocked through incubation
with bovine serum albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the prostate cancer protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the prostate cancer protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
prostate cancer protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the prostate cancer protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., 125I for the proteins and a fluorophore
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., a prostate cancer protein), such as an antibody, peptide, binding partner,
ligand, etc. Under certain circumstances, there may be competitive binding between the
compound and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the prostate cancer protein and thus is capable of binding to, and potentially modulating,
the activity of the prostate cancer protein. In this embodiment, either component can be
labeled. Thus, e.g., if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the prostate cancer protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the prostate cancer protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the prostate cancer proteins. In
this embodiment, the methods comprise combining a prostate cancer protein and a competitor
in a first sample. A second sample comprises a test compound, a prostate cancer protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the prostate cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the prostate cancer protein.
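
By way of illustration only, the two-sample comparison described above can be sketched in Python. The function name, the replicate readings and the 20% cut-off are hypothetical and are not taken from this disclosure; any suitable measure of competitor binding could be substituted.

```python
from statistics import mean


def competitor_binding_change(first_sample, second_sample):
    """Return the fractional change in mean competitor binding.

    first_sample:  replicate signals for protein + competitor
    second_sample: replicate signals for test compound + protein + competitor
    A negative value means less competitor is bound when the test compound
    is present.
    """
    baseline = mean(first_sample)
    if baseline == 0:
        raise ValueError("first sample shows no competitor binding")
    return (mean(second_sample) - baseline) / baseline


# Hypothetical triplicate readings (arbitrary signal units).
first = [1000.0, 1040.0, 960.0]
second = [410.0, 390.0, 400.0]

change = competitor_binding_change(first, second)
# Assumed illustrative cut-off: flag compounds that shift binding by 20% or more.
is_candidate = abs(change) >= 0.20
print(f"change in competitor binding: {change:.1%}; candidate binder: {is_candidate}")
```

A difference in either direction is flagged, consistent with the statement that any change in competitor binding between the two samples indicates a binding agent.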

Alternatively, differential screening is used to identify drug candidates that
bind to the native prostate cancer protein, but cannot bind to modified prostate cancer
proteins. The structure of the prostate cancer protein may be modeled, and used in rational
drug design to synthesize agents that interact with that site. Drug candidates that affect the
activity of a prostate cancer protein are also identified by screening drugs for the ability to
either enhance or reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are performed in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled agent determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.
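
As a non-limiting sketch, triplicate scintillation counts from test and negative control samples might be summarized as follows. The function name and the counts-per-minute values are hypothetical, and subtracting the mean negative control count as background is an assumed convention that the text does not prescribe.

```python
from statistics import mean, stdev


def summarize_counts(test_cpm, negative_control_cpm):
    """Return (background-corrected mean, standard deviation of test replicates).

    Both inputs are lists of counts per minute; at least triplicate
    measurements are expected, as preferred in the text.
    """
    if len(test_cpm) < 3 or len(negative_control_cpm) < 3:
        raise ValueError("at least triplicate measurements are expected")
    corrected = mean(test_cpm) - mean(negative_control_cpm)
    return corrected, stdev(test_cpm)


# Hypothetical counts per minute for one test compound and its negative control.
test_counts = [5200, 5000, 5100]
background = [210, 190, 200]

corrected, spread = summarize_counts(test_counts, background)
print(f"bound label: {corrected} cpm above background (SD {spread:.0f} cpm)")
```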

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g. albumin, detergents, etc. which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of a prostate cancer protein. The methods
comprise adding a test compound, as defined above, to a cell comprising prostate cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a prostate cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or previous
or subsequent exposure to, physiological signals, e.g. hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogens, or other cells (i.e. cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate prostate cancer agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the prostate cancer protein. Once identified, similar structures are evaluated to identify
critical structural features of the compound.

In one embodiment, a method of inhibiting prostate cancer cell division is
provided. The method comprises administration of a prostate cancer inhibitor. In another
embodiment, a method of inhibiting prostate cancer is provided. The method comprises
administration of a prostate cancer inhibitor. In a further embodiment, methods of treating
cells or individuals with prostate cancer are provided. The method comprises administration
of a prostate cancer inhibitor.

In one embodiment, a prostate cancer inhibitor is an antibody as discussed
above. In another embodiment, the prostate cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, and metastasis assays are known to 
those of skill in the art, as described below. 

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are
transformed, they lose this phenotype and grow detached from the substrate. For example,
transformed cells can grow in stirred suspension culture or suspended in semi-solid media,
such as semi-solid or soft agar. The transformed cells, when transfected with tumor
suppressor genes, regenerate normal phenotype and require a solid substrate to attach and
grow. Soft agar growth or colony formation in suspension assays can be used to identify
modulators of prostate cancer sequences, which when expressed in host cells, inhibit
abnormal cellular proliferation and transformation. A therapeutic compound would reduce or
eliminate the host cells' ability to grow in stirred suspension culture or suspended in
semi-solid media, such as semi-solid or soft agar.

Techniques for soft agar growth or colony formation in suspension assays are
described in Freshney, Culture of Animal Cells a Manual of Basic Technique (3rd ed., 1994),
herein incorporated by reference. See also, the methods section of Garkavtsev et al. (1996),
supra, herein incorporated by reference.

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until
they touch other cells. When the cells touch one another, they are contact inhibited and stop
growing. When cells are transformed, however, the cells are not contact inhibited and
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a
higher saturation density than normal cells. This can be detected morphologically by the
formation of a disoriented monolayer of cells or rounded cells in foci within the regular
pattern of normal surrounding cells. Alternatively, labeling index with (3H)-thymidine at
saturation density can be used to measure density limitation of growth. See Freshney (1994),
supra. The transformed cells, when transfected with tumor suppressor genes, regenerate a
normal phenotype and become contact inhibited and would grow to a lower density.

In this assay, labeling index with (3H)-thymidine at saturation density is a
preferred method of measuring density limitation of growth. Transformed host cells are
transfected with a prostate cancer-associated sequence and are grown for 24 hours at
saturation density in non-limiting medium conditions. The percentage of cells labeling with
(3H)-thymidine is determined autoradiographically. See, Freshney (1994), supra.
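
For illustration, the labeling index described above reduces to a simple percentage of labeled cells among the cells counted. The following Python sketch uses a hypothetical function name and hypothetical cell counts; it is not part of the cited technique.

```python
def labeling_index(labeled_cells, total_cells):
    """Percentage of counted cells that incorporated the (3H)-thymidine label."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


# Hypothetical autoradiography counts at saturation density.
transformed_index = labeling_index(340, 1000)  # untreated transformed cells
transfected_index = labeling_index(85, 1000)   # cells after transfection

print(f"transformed: {transformed_index:.1f}%, transfected: {transfected_index:.1f}%")
```

A lower index after transfection would be consistent with restored density limitation of growth.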

Growth factor or serum dependence

Transformed cells have a lower serum dependence than their normal
counterparts (see, e.g., Temin, J. Natl. Cancer Inst. 37:167-175 (1966); Eagle et al., J. Exp.
Med. 131:836-879 (1970); Freshney, supra). This is in part due to release of various growth
factors by the transformed cells. Growth factor or serum dependence of transformed host
cells can be compared with that of control.

Tumor specific marker levels

Tumor cells release a greater amount of certain factors (hereinafter "tumor
specific markers") than their normal counterparts. For example, plasminogen activator (PA)
is released from human glioma at a higher level than from normal brain cells (see, e.g.,
Gullino, Angiogenesis, tumor vascularization, and potential interference with tumor growth,
in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985)). Similarly, tumor
angiogenesis factor (TAF) is released at a higher level in tumor cells than their normal
counterparts (see, e.g., Folkman, Angiogenesis and Cancer, Sem. Cancer Biol. (1992)).

Various techniques which measure the release of these factors are described in
Freshney (1994), supra. Also, see, Unkless et al., J. Biol. Chem. 249:4295-4305 (1974);
Strickland & Beers, J. Biol. Chem. 251:5694-5702 (1976); Whur et al., Br. J. Cancer
42:305-312 (1980); Gullino, Angiogenesis, tumor vascularization, and potential interference
with tumor growth, in Biological Responses in Cancer, pp. 178-184 (Mihich (ed.) 1985);
Freshney, Anticancer Res. 5:111-130 (1985).

Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix
constituent can be used as an assay to identify compounds that modulate prostate
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor
gene in these host cells would decrease invasiveness of the host cells.

Techniques described in Freshney (1994), supra, can be used. Briefly, the
level of invasion of host cells can be measured by using filters coated with Matrigel or some
other extracellular matrix constituent. Penetration into the gel, or through to the distal side of
the filter, is rated as invasiveness, and rated histologically by number of cells and distance
moved, or by prelabeling the cells with 125I and counting the radioactivity on the distal side of
the filter or bottom of the dish. See, e.g., Freshney (1984), supra.


Tumor growth in vivo 

Effects of prostate cancer-associated sequences on cell growth can be tested in
transgenic or immune-suppressed mice. Knock-out transgenic mice can be made, in which
the prostate cancer gene is disrupted or in which a prostate cancer gene is inserted.
Knock-out transgenic mice can be made by insertion of a marker gene or other heterologous gene
into the endogenous prostate cancer gene site in the mouse genome via homologous
recombination. Such mice can also be made by substituting the endogenous prostate cancer
gene with a mutated version of the prostate cancer gene, or by mutating the endogenous
prostate cancer gene, e.g., by exposure to carcinogens.

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells
containing the newly engineered genetic lesion are injected into a host mouse embryo, which
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic
lesion (see, e.g., Capecchi et al., Science 244:1288 (1989)). Chimeric targeted mice can be
derived according to Hogan et al., Manipulating the Mouse Embryo: A Laboratory Manual,
Cold Spring Harbor Laboratory (1988) and Teratocarcinomas and Embryonic Stem Cells: A
Practical Approach, Robertson, ed., IRL Press, Washington, D.C., (1987).

Alternatively, various immune-suppressed or immune-deficient host animals
can be used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella et al., J.
Natl. Cancer Inst. 52:921 (1974)), a SCID mouse, a thymectomized mouse, or an irradiated
mouse (see, e.g., Bradley et al., Br. J. Cancer 38:263 (1978); Selby et al., Br. J. Cancer
41:52 (1980)) can be used as a host. Transplantable tumor cells (typically about 10^6 cells)
injected into isogenic hosts will produce invasive tumors in a high proportion of cases, while
normal cells of similar origin will not. In hosts which developed invasive tumors, cells
expressing a prostate cancer-associated sequence are injected subcutaneously. After a
suitable length of time, preferably 4-8 weeks, tumor growth is measured (e.g., by volume or
by its two largest dimensions) and compared to the control. Tumors that have statistically
significant reduction (using, e.g., Student's t test) are said to have inhibited growth.
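
The comparison of tumor growth against control can be sketched as follows. This is a minimal illustration only: the ellipsoid formula V = length x width^2 / 2 is an assumed convention for estimating volume from the two largest dimensions and is not specified in the text, and the function names and measurements are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Estimate volume (mm^3) from the two largest dimensions.

    Uses the ellipsoid approximation V = length * width**2 / 2, an assumed
    convention that the text does not specify.
    """
    return length_mm * width_mm ** 2 / 2.0


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical (length, width) measurements in mm after the growth period.
control = [tumor_volume(l, w) for l, w in [(12, 10), (13, 10), (11, 10)]]
treated = [tumor_volume(l, w) for l, w in [(8, 6), (9, 6), (7, 6)]]

t_stat, dof = student_t(control, treated)
# 2.776 is the standard two-sided 5% critical value of t for 4 degrees of freedom.
significant = abs(t_stat) > 2.776
print(f"t = {t_stat:.2f} with {dof} degrees of freedom; significant: {significant}")
```

For other group sizes the critical value would be taken from a t table for the corresponding degrees of freedom.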

Methods of identifying variant prostate cancer-associated sequences

Without being bound by theory, expression of various prostate cancer
sequences is correlated with prostate cancer. Accordingly, disorders based on mutant or
variant prostate cancer genes may be determined. In one embodiment, the invention provides
methods for identifying cells containing variant prostate cancer genes, e.g., determining all or
part of the sequence of at least one endogenous prostate cancer gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the prostate cancer genotype of an individual, e.g.,
determining all or part of the sequence of at least one prostate cancer gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced prostate cancer gene to a known prostate cancer
gene, i.e., a wild-type gene.

The sequence of all or part of the prostate cancer gene can then be compared
to the sequence of a known prostate cancer gene to determine if any differences exist. This
can be done using any number of known homology programs, such as Bestfit, etc. In a
preferred embodiment, the presence of a difference in the sequence between the prostate
cancer gene of the patient and the known prostate cancer gene correlates with a disease state
or a propensity for a disease state, as outlined herein.
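
As a simplified illustration of such a comparison, the following Python sketch reports the positions at which a sequenced gene differs from a known sequence. It assumes the two sequences have already been aligned to equal length, and it does not reproduce Bestfit or any other homology program; the function name and the example sequences are hypothetical.

```python
def sequence_differences(known, sample):
    """List (position, known_base, sample_base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    positions are 1-based and comparison is case-insensitive.
    """
    if len(known) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(known.upper(), sample.upper()), start=1)
        if a != b
    ]


# Hypothetical aligned fragments of a known gene and a patient-derived gene.
known_gene = "ATGGCCAAGTTC"
patient_gene = "ATGGCCGAGTTC"

differences = sequence_differences(known_gene, patient_gene)
print(differences)
```

An empty list indicates that no differences exist over the compared region.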

In a preferred embodiment, the prostate cancer genes are used as probes to
determine the number of copies of the prostate cancer gene in the genome.

In another preferred embodiment, the prostate cancer genes are used as probes
to determine the chromosomal localization of the prostate cancer genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis in particular when
chromosomal abnormalities such as translocations, and the like are identified in the prostate
cancer gene locus.

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of a prostate cancer
protein or modulator thereof, is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces effects for which it is administered. The exact dose will
depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3, 1992), Dekker, ISBN
0824770846, 082476918X, 0824712692, 0824716981; Lloyd, The Art, Science and
Technology of Pharmaceutical Compounding (1999); and Pickar, Dosage Calculations
(1999)). As is known in the art, adjustments for prostate cancer degradation, systemic versus
localized delivery, and rate of new protease synthesis, as well as the age, body weight,
general health, sex, diet, time of administration, drug interaction and the severity of the
condition may be necessary, and will be ascertainable with routine experimentation by those
skilled in the art. U.S. Patent Application No. 09/687,576, which further discloses the use of
compositions and methods of diagnosis and treatment in prostate cancer, is hereby expressly
incorporated by reference.

A "patient" for the purposes of the present invention includes both humans
and other animals, particularly mammals. Thus the methods are applicable to both human
therapy and veterinary applications. In the preferred embodiment the patient is a mammal,
preferably a primate, and in the most preferred embodiment the patient is human.

The administration of the prostate cancer proteins and modulators thereof of
the present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, e.g., in the treatment of wounds and inflammation, the prostate cancer
proteins and modulators may be directly applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise a prostate
cancer protein in a form suitable for administration to a patient. In the preferred embodiment,
the pharmaceutical compositions are in a water soluble form, such as being present as
pharmaceutically acceptable salts, which is meant to include both acid and base addition
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the
biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the 
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline 
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring 
agents; coloring agents; and polyethylene glycol. 

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that prostate cancer protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise a prostate
cancer protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an
aqueous carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like.
These solutions are sterile and generally free of undesirable matter. These compositions may
be sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride,
sodium lactate and the like. The concentration of active agent in these formulations can vary
widely, and will be selected primarily based on fluid volumes, viscosities, body weight and
the like in accordance with the particular mode of administration selected and the patient's
needs (e.g., Remington's Pharmaceutical Science (15th ed., 1980) and Goodman & Gilman,
The Pharmacological Basis of Therapeutics (Hardman et al., eds., 1996)).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of prostate cancer proteins can be 
administered for therapeutic or prophylactic treatments. In therapeutic applications, 
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an 
amount sufficient to cure or at least partially arrest the disease and its complications. An 
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts 
effective for this use will depend upon the severity of the disease and the general state of the 
patient's health. Single or multiple administrations of the compositions may be administered 
depending on the dosage and frequency as required and tolerated by the patient In any event, 
the composition should provide a sufficient quantity of the agents of this invention to 
effectively treat the patient. An amount of modulator that is capable of preventing or slowing 
the development of cancer in a mammal is referred to as a "prophylactically effective dose." 
Hie particular dose required for a prophylactic treatment will depend upon the medical 
condition and history of the mammal, the particular cancer being prevented, as well as other 
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic 
treatments may be used in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood 
of developing cancer. 

It will be appreciated that the present prostate cancer protein-modulating 
compounds can be administered alone or in combination with additional prostate cancer
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments. 

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides 
comprising nucleic acid sequences set forth in Tables 1-16, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of prostate cancer-associated 
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or 
organism-based) recombinant expression systems. 

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger & Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 (Berger), Ausubel et al., eds., Current Protocols (supplemented through 1999),
and Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 1-3, 1989)).

In a preferred embodiment, prostate cancer proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
prostate cancer genes (including the full-length sequence, partial sequences, or
regulatory sequences of the prostate cancer coding regions) can be administered in a gene
therapy application. These prostate cancer genes can include antisense applications, either as
gene therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Prostate cancer polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, e.g., lipidated peptides (see, e.g., Vitiello, A. et al., J. Clin. Invest.
95:341 (1995)), peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG")
microspheres (see, e.g., Eldridge, et al., Molec. Immunol. 28:287-294 (1991); Alonso et al.,
Vaccine 12:299-306 (1994); Jones et al., Vaccine 13:675-681 (1995)), peptide compositions
contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et al., Nature
344:873-875 (1990); Hu et al., Clin. Exp. Immunol. 113:235-243 (1998)), multiple antigen
peptide systems (MAPs) (see, e.g., Tam, Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413 (1988);
Tam, J. Immunol. Methods 196:17-32 (1996)), peptides formulated as multivalent peptides;
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery
vectors (Perkus, et al., In: Concepts in vaccine development (Kaufmann, ed., p. 379, 1996);
Chakrabarti, et al., Nature 320:535 (1986); Hu et al., Nature 320:537 (1986); Kieny, et al.,
AIDS Bio/Technology 4:790 (1986); Top et al., J. Infect. Dis. 124:148 (1971); Chanda et al.,
Virology 175:535 (1990)), particles of viral or synthetic origin (see, e.g., Kofler et al., J.
Immunol. Methods 192:25 (1996); Eldridge et al., Sem. Hematol. 30:16 (1993); Falo et al.,
Nature Med. 7:649 (1995)), adjuvants (Warren et al., Annu. Rev. Immunol. 4:369 (1986);
Gupta et al., Vaccine 11:293 (1993)), liposomes (Reddy et al., J. Immunol. 148:1585 (1992);
Rock, Immunol. Today 17:131 (1996)), or naked or particle-absorbed cDNA (Ulmer, et al.,
Science 259:1745 (1993); Robinson et al., Vaccine 11:957 (1993); Shiver et al., In: Concepts
in vaccine development (Kaufmann, ed., p. 423, 1996); Cease & Berzofsky, Annu. Rev.
Immunol. 12:923 (1994) and Eldridge et al., Sem. Hematol. 30:16 (1993)). Toxin-targeted
delivery technologies, also known as receptor mediated targeting, such as those of Avant
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit,
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;




WO 02/30268 PCT/US01/32045 



polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A. 
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be 
used as adjuvants. 

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode prostate cancer
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S.
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are
described in Stover et al., Nature 351:456-460 (1991). A wide variety of other vectors useful
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus vectors,
retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the like,
will be apparent to those skilled in the art from the description herein (see, e.g., Shata et al.,
Mol. Med. Today 6:66-71 (2000); Shedlock et al., J. Leukoc. Biol. 68:793-806 (2000); Hipp et
al., In Vivo 14:571-85 (2000)).

Methods for the use of genes as DNA vaccines are well known, and include
placing a prostate cancer gene or portion of a prostate cancer gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in a prostate cancer patient.
The prostate cancer gene used for DNA vaccines can encode full-length prostate cancer
proteins, but more preferably encodes portions of the prostate cancer proteins including
peptides derived from the prostate cancer protein. In one embodiment, a patient is
immunized with a DNA vaccine comprising a plurality of nucleotide sequences derived from
a prostate cancer gene. For example, prostate cancer-associated genes or sequences encoding
subfragments of a prostate cancer protein are introduced into expression vectors and tested
for their immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T
cell responses. This procedure provides for production of cytotoxic T cell responses against
cells which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an 
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that 
increase the immunogenic response to the prostate cancer polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, prostate cancer genes find use in generating
animal models of prostate cancer. When the prostate cancer gene identified is repressed or
diminished in cancer tissue, gene therapy technology, e.g., wherein antisense RNA is directed
to the prostate cancer gene, will also diminish or repress expression of the gene. Animal
models of prostate cancer find use in screening for modulators of a prostate cancer-associated
sequence or modulators of prostate cancer. Similarly, transgenic animal technology
including gene knockout technology, e.g., as a result of homologous recombination with an
appropriate gene targeting vector, will result in the absence or increased expression of the
prostate cancer protein. When desired, tissue-specific expression or knockout of the prostate
cancer protein may be necessary.

It is also possible that the prostate cancer protein is overexpressed in prostate 
cancer. As such, transgenic animals can be generated that overexpress the prostate cancer 
protein. Depending on the desired expression level, promoters of various strengths can be 
employed to express the transgene. Also, the number of copies of the integrated transgene
can be determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods find use as animal models of prostate cancer and are 
additionally useful in screening for modulators to treat prostate cancer. 

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, prostate cancer-specific
nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative prostate cancer polypeptides or polynucleotides, small
molecule inhibitors of prostate cancer-associated sequences, etc. A therapeutic product may
include sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions 
(i.e., protocols) for the practice of the methods of this invention. While the instructional 
materials typically comprise written or printed materials they are not limited to such. Any 
medium capable of storing such instructions and communicating them to an end user is 
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the 
like. Such media may include addresses to internet sites that provide such instructional 
materials. 

The present invention also provides for kits for screening for modulators of 
prostate cancer-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following 
materials: a prostate cancer-associated polypeptide or polynucleotide, reaction tubes, and 
instructions for testing prostate cancer-associated activity. Optionally, the kit contains 
biologically active prostate cancer protein. A wide variety of kits and components can be 
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality 
of genes or products. The genes will be selected based on correlations with important 
parameters in disease which may be identified in historical or outcome data. 

EXAMPLES

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints 

Purifying total RNA from tissue sample using TRIzol Reagent

The sample weight is first estimated. The tissue samples are homogenized in
1 ml of TRIzol per 50 mg of tissue using a homogenizer (e.g., Polytron 3100). The size of
the generator/probe used depends upon the sample amount. A generator that is too large for
the amount of tissue to be homogenized will cause a loss of sample and lower RNA yield. A
larger generator (e.g., 20 mm) is suitable for tissue samples weighing more than 0.6 g.
Tubes should not be overfilled. If the working volume is greater than 2 ml and no greater than
10 ml, a 15 ml polypropylene tube (Falcon 2059) is suitable for homogenization.

Tissues should be kept frozen until homogenized. The TRIzol is added
directly to the frozen tissue before homogenization. Following homogenization, the insoluble
material is removed from the homogenate by centrifugation at 7500 x g for 15 min. in a
Sorvall superspeed or 12,000 x g for 10 min. in an Eppendorf centrifuge at 4°C. The cleared
homogenate is then transferred to a new tube(s). Samples may be frozen and stored at -60 to
-70°C for at least one month, or the purification may be continued.

The next process is phase separation. The homogenized samples are incubated
for 5 minutes at room temperature. Then, 0.2 ml of chloroform per 1 ml of TRIzol reagent is
added to the homogenization mixture. The tubes are securely capped and shaken vigorously
by hand (do not vortex) for 15 seconds. The samples are then incubated at room temp. for
2-3 minutes and next centrifuged at 6500 rpm in a Sorvall superspeed for 30 min. at 4°C.
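
The protocol quotes some spins in rpm and others in x g. As a sketch only, the two can be related with the standard relative centrifugal force formula, RCF = 1.118 x 10^-5 x r x rpm^2, where r is the rotor radius in cm. The rotor radius is not given in the text; the 10 cm used in the usage line is a hypothetical value chosen for illustration, and the function names are likewise illustrative.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Speed in rpm needed to reach a given RCF at a given rotor radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    # Hypothetical 10 cm rotor radius (assumption, not from the protocol).
    print(round(rcf_from_rpm(6500, 10.0), 2))
```

The actual radius should be taken from the rotor documentation before relying on any converted value.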

The next process is RNA precipitation. The aqueous phase is transferred to a
fresh tube. The organic phase can be saved if isolation of DNA or protein is desired. Then
0.5 ml of isopropyl alcohol is added per 1 ml of TRIzol reagent used in the original
homogenization. Then, the tubes are securely capped and inverted to mix. The samples are
then incubated at room temp. for 10 minutes and centrifuged at 6500 rpm in a Sorvall for 20
min. at 4°C.

The RNA is then washed. The supernatant is poured off and the pellet washed
with cold 75% ethanol. 1 ml of 75% ethanol is used per 1 ml of the TRIzol reagent used in
the initial homogenization. The tubes are capped securely and inverted several times to
loosen the pellet without vortexing. They are next centrifuged at <8000 rpm (<7500 x g) for 5
minutes at 4°C.
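
The reagent volumes in the steps above all scale with the amount of TRIzol, which in turn scales with tissue mass. A minimal sketch of that arithmetic follows; the ratios are those stated in the text, while the function and key names are illustrative only.

```python
def trizol_volumes_ml(tissue_mg: float) -> dict:
    """Scale reagent volumes (ml) from tissue mass using the stated ratios."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol = tissue_mg / 50.0  # 1 ml TRIzol per 50 mg tissue
    return {
        "trizol": trizol,
        "chloroform": 0.2 * trizol,     # 0.2 ml per 1 ml TRIzol
        "isopropanol": 0.5 * trizol,    # 0.5 ml per 1 ml TRIzol
        "ethanol_75pct": 1.0 * trizol,  # 1 ml per 1 ml TRIzol
    }


if __name__ == "__main__":
    for reagent, ml in trizol_volumes_ml(100).items():
        print(f"{reagent}: {ml:.2f} ml")
```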

The RNA wash is decanted. The pellet is carefully transferred to an
Eppendorf tube (sliding down the tube into the new tube by use of a pipet tip to help guide it
in if necessary). Tube sizes for precipitating the RNA depend on the working volumes.
Larger tubes may take too long to dry. The pellet is dried. The RNA is then resuspended in an
appropriate volume (e.g., 2-5 ug/ul) of DEPC H2O. The absorbance is then measured.

The poly A+ mRNA may next be purified from total RNA by other methods
such as Qiagen's RNeasy kit. The poly A+ mRNA is purified from total RNA by adding the
Oligotex suspension, which has been heated to 37°C and mixed prior to adding to the RNA.
The Elution Buffer is incubated at 70°C. If there is precipitate in the buffer, warm up the 2 x
Binding Buffer at 65°C. The total RNA is mixed with DEPC-treated water, 2 x Binding
Buffer, and Oligotex according to Table 2 on page 16 of the Oligotex Handbook and next
incubated for 3 minutes at 65°C and 10 minutes at room temperature.

The preparation is centrifuged for 2 minutes at 14,000 to 18,000 g, preferably
at a "soft setting." The supernatant is removed without disturbing the Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. The supernatant is saved until
satisfactory binding and elution of poly A+ mRNA has been found.

Then, the preparation is gently resuspended in Wash Buffer OW2 and pipetted
onto the spin column and centrifuged at full speed (soft setting if possible) for 1 minute.

Next, the spin column is transferred to a new collection tube and gently
resuspended in Wash Buffer OW2 and centrifuged as described herein.

Then, the spin column is transferred to a new tube and eluted with 20 to 100 ul 
of preheated (70°C) Elution Buffer. The Oligotex resin is gently resuspended by pipetting up 
and down. The centrifugation is repeated as above and the elution repeated with fresh elution
buffer or first eluate to keep the elution volume low. 

The absorbance is next read to determine the yield, using diluted Elution
Buffer as the blank.
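
The yield can be estimated from the absorbance reading. The sketch below assumes the common approximation that an A260 of 1.0 corresponds to about 40 ug/ml of RNA; this factor, the example numbers, and the function and parameter names are not stated in the text and are given for illustration only.

```python
def rna_yield_ug(a260: float, dilution_factor: float, volume_ul: float,
                 ug_per_ml_per_a260: float = 40.0) -> float:
    """Estimate RNA yield (ug) from a blank-corrected A260 reading.

    a260            -- absorbance at 260 nm of the diluted sample
    dilution_factor -- fold dilution used for the reading (e.g. 50 for 1:50)
    volume_ul       -- total volume of the undiluted RNA sample in ul
    """
    conc_ug_per_ml = a260 * ug_per_ml_per_a260 * dilution_factor
    return conc_ug_per_ml * volume_ul / 1000.0


if __name__ == "__main__":
    # Hypothetical reading: A260 of 0.5 at a 1:50 dilution of a 100 ul sample.
    print(rna_yield_ug(0.5, 50, 100))
```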

Before proceeding with cDNA synthesis, the mRNA is precipitated, as
components left over from the Oligotex purification procedure or present in the Elution
Buffer will inhibit downstream enzymatic reactions of the mRNA.
0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol is added and the preparation
precipitated at -20°C for 1 hour to overnight (or 20-30 min. at -70°C), and centrifuged at
14,000-16,000 x g for 30 minutes at 4°C. Next, the pellet is washed with 0.5 ml of 80%
ethanol (-20°C) and then centrifuged at 14,000-16,000 x g for 5 minutes at room temperature.
The 80% ethanol wash is then repeated. The last bit of ethanol from the pellet is then dried
without use of a speed vacuum and the pellet is then resuspended in DEPC H2O at 1 ug/ul
concentration.

Alternatively, the RNA may be purified using other methods (e.g., Qiagen's RNeasy kit).

No more than 100 ug is added to the RNeasy column. The sample volume is
adjusted to 100 ul with RNase-free water. 350 ul Buffer RLT and then 250 ul ethanol
(100%) are added to the sample. The preparation is then mixed by pipetting and applied to an
RNeasy mini spin column for centrifugation (15 sec at >10,000 rpm). If yield is low, reapply
the flowthrough to the column and centrifuge again.

Then, transfer the column to a new 2 ml collection tube, add 500 ul Buffer
RPE and centrifuge for 15 sec at >10,000 rpm. The flowthrough is discarded. Another 500 ul
of Buffer RPE is then added and the preparation is centrifuged for 15 sec at >10,000 rpm. The
flowthrough is discarded, and the column membrane dried by centrifuging for 2 min at
maximum speed. The column is transferred to a new 1.5 ml collection tube. 30-50 ul of
RNase-free water is applied directly onto the column membrane. The column is then centrifuged
for 1 min at >10,000 rpm and the elution step repeated.

The absorbance is then read to determine yield. If necessary, the material may 
be ethanol precipitated with ammonium acetate and 2.5X volume 100% ethanol. 

First Strand cDNA Synthesis

The first strand can be made using Gibco's "Superscript Choice System
for cDNA Synthesis" kit. The starting material is 5 ug of total RNA or 1 ug of polyA+
mRNA. For total RNA, 2 ul of Superscript RT is used; for polyA+ mRNA, 1 ul of
Superscript RT is used. The final volume of first strand synthesis mix is 20 ul. The RNA
should be in a volume no greater than 10 ul. The RNA is incubated with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C followed by addition on ice of 7 ul of: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. The preparation is then incubated at
37°C for 2 min before addition of the Superscript RT followed by incubation at 37°C for 1
hour.
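
The volumes above can be checked against the stated 20 ul final volume. The sketch below assumes that any shortfall is made up with water; the text does not say so explicitly, and the function name is illustrative.

```python
def first_strand_water_ul(rna_ul: float, is_total_rna: bool) -> float:
    """Water (ul) needed to bring the first strand reaction to 20 ul.

    Assumption: the balance of the reaction is made up with water.
    """
    if not 0 < rna_ul <= 10:
        raise ValueError("RNA volume must be greater than 0 and at most 10 ul")
    oligo_ul = 1.0                            # T7-T24 oligo
    master_mix_ul = 4.0 + 2.0 + 1.0           # 5X buffer + 0.1 M DTT + 10 mM dNTP
    rt_ul = 2.0 if is_total_rna else 1.0      # Superscript RT
    return 20.0 - (rna_ul + oligo_ul + master_mix_ul + rt_ul)


if __name__ == "__main__":
    print(first_strand_water_ul(10.0, is_total_rna=True))   # full 10 ul of total RNA
    print(first_strand_water_ul(10.0, is_total_rna=False))  # full 10 ul of polyA+ mRNA
```

With the maximum 10 ul of total RNA the components already sum to 20 ul, which is consistent with the 10 ul limit on the RNA volume.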

Second Strand Synthesis

For the second strand synthesis, place 1st strand reactions on ice and add: 91
ul DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2
hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA.
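
As a quick consistency check on the recipe above, the additions can be summed with the 20 ul first strand reaction to confirm that the 5X buffer ends up at 1X. This is a sketch; the dictionary and function names are illustrative.

```python
# Additions to the first strand reaction, in ul, as listed in the text.
SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10 mM dNTP mix": 3.0,
    "E. coli DNA Ligase": 1.0,
    "E. coli DNA Polymerase": 4.0,
    "RNase H": 1.0,
}


def second_strand_summary(first_strand_ul: float = 20.0):
    """Return (total reaction volume in ul, final buffer strength in X)."""
    total = first_strand_ul + sum(SECOND_STRAND_ADDITIONS_UL.values())
    buffer_x = 5.0 * SECOND_STRAND_ADDITIONS_UL["5X 2nd Strand Buffer"] / total
    return total, buffer_x


if __name__ == "__main__":
    total, buffer_x = second_strand_summary()
    print(f"total {total} ul, buffer {buffer_x}X")
```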

Cleaning up cDNA 

The cDNA is purified using Phenol:Chloroform:Isoamyl Alcohol (25:24:1)
and Phase-Lock gel tubes. The PLG tubes are centrifuged for 30 sec at maximum speed.
The cDNA mix is then transferred to a PLG tube. An equal volume of
phenol:chloroform:isoamyl alcohol is then added, the preparation shaken vigorously (no
vortexing), and centrifuged for 5 minutes at maximum speed. The top aqueous solution is
transferred to a new tube and ethanol precipitated by adding 7.5X 5M NH4OAc and 2.5X
volume of 100% ethanol. Next, it is centrifuged immediately at room temperature for 20
min at maximum speed. The supernatant is removed, and the pellet washed 2X with cold
80% ethanol. As much ethanol wash as possible should be removed before air drying the
pellet and resuspending it in 3 ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

In vitro Transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul
T7 10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75
mM) (Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7
transcription buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is
20 ul. Incubate 6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
Clean-up follows the previous instructions for RNeasy columns or Qiagen's RNeasy protocol
handbook. The cRNA often needs to be ethanol precipitated and then resuspended in a volume
compatible with the fragmentation step.
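
The labeling mix can be checked by summing the component volumes and computing the final nucleotide concentrations in the 20 ul reaction. The sketch below uses only the volumes and stock concentrations given above; the data structure and function names are illustrative.

```python
# name -> (volume in ul, stock concentration in mM, or None where not given)
IVT_MIX = {
    "cDNA": (1.5, None),
    "T7 10x ATP": (2.0, 75.0),
    "T7 10x GTP": (2.0, 75.0),
    "T7 10x CTP": (1.5, 75.0),
    "T7 10x UTP": (1.5, 75.0),
    "Bio-11-UTP": (3.75, 10.0),
    "Bio-16-CTP": (3.75, 10.0),
    "10x T7 transcription buffer": (2.0, None),
    "10x T7 enzyme mix": (2.0, None),
}


def total_volume_ul(mix: dict) -> float:
    """Sum of all component volumes in ul."""
    return sum(volume for volume, _ in mix.values())


def final_concentrations_mm(mix: dict) -> dict:
    """Final concentration (mM) of each component with a known stock."""
    total = total_volume_ul(mix)
    return {name: stock * volume / total
            for name, (volume, stock) in mix.items() if stock is not None}


if __name__ == "__main__":
    print(total_volume_ul(IVT_MIX))
    for name, conc in final_concentrations_mm(IVT_MIX).items():
        print(f"{name}: {conc} mM")
```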

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.
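
The 1 x working concentrations follow directly from the 5 x stock composition given above. A minimal sketch of the dilution arithmetic, with illustrative names:

```python
# 5 x Fragmentation buffer composition in mM, as given in the text.
FRAGMENTATION_BUFFER_5X_MM = {
    "Tris-acetate, pH 8.1": 200.0,
    "KOAc": 500.0,
    "MgOAc": 150.0,
}


def working_concentrations_mm(stock_mm: dict, fold: float = 5.0) -> dict:
    """Concentrations (mM) after diluting the stock `fold` times."""
    return {name: conc / fold for name, conc in stock_mm.items()}


def stock_volume_ul(reaction_ul: float, fold: float = 5.0) -> float:
    """Volume of stock buffer (ul) needed for a reaction of the given size."""
    return reaction_ul / fold


if __name__ == "__main__":
    print(working_concentrations_mm(FRAGMENTATION_BUFFER_5X_MM))
    print(stock_volume_ul(10.0), stock_volume_ul(20.0))
```

For the recommended 10 ul reaction this corresponds to 2 ul of 5 x buffer, and 4 ul for the 20 ul upper limit.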

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; and 1x MES hyb buffer to 300 ul.
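
The amount of cRNA in a given volume of hybridization mix follows from the 50 ng/ul final concentration. A small sketch, with an illustrative function name:

```python
def crna_ug(mix_volume_ul: float, final_conc_ng_per_ul: float = 50.0) -> float:
    """Mass of fragmented cRNA (ug) contained in a volume of hybridization mix."""
    return mix_volume_ul * final_conc_ng_per_ul / 1000.0


if __name__ == "__main__":
    print(crna_ug(200.0))  # volume applied to one chip
    print(crna_ug(300.0))  # initial mix volume suggested for multiple chips
```

The 200 ul applied to a chip therefore carries the 10 ug of cRNA quoted in the text, and a 300 ul mix requires 15 ug, which matches the amount usually fragmented.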

The hybridization reaction is conducted with non-biotinylated IVT (purified
by RNeasy columns) (see Example 1 for steps from tissue to IVT). The following mixture is
prepared:






IVT antisense RNA; 4 ug:       ___ ul
Random Hexamers (1 ug/ul):     4 ul
H2O:                           ___ ul
Total:                         14 ul

Incubate the above 14 ul mixture at 70°C for 10 min.; then put on ice.

The reverse transcription procedure uses the following mixture:
0.1 M DTT:                     3 ul
50X dNTP mix:                  0.6 ul
H2O:                           2.4 ul
Cy3 or Cy5 dUTP (1 mM):        3 ul
SS RT II (BRL):                1 ul
Total:                         16 ul

The above solution is added to the hybridization reaction and incubated for 30 min. at 42°C.
Then, 1 ul SS II is added and incubated for another hour before being placed on ice.

The 50X dNTP mix contains 25 mM each of cold dATP, dCTP, and dGTP and 10 mM
of dTTP, and is made by adding 25 ul each of 100 mM dATP, dCTP, and dGTP and 10 ul of
100 mM dTTP to 15 ul H2O.
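
The stated concentrations of the 50X dNTP mix can be verified from the recipe. The sketch below uses only the volumes and stock concentrations given; the names are illustrative.

```python
# name -> (volume in ul, stock concentration in mM); water has no solute.
DNTP_MIX_50X = {
    "dATP": (25.0, 100.0),
    "dCTP": (25.0, 100.0),
    "dGTP": (25.0, 100.0),
    "dTTP": (10.0, 100.0),
    "H2O": (15.0, 0.0),
}


def mix_concentrations_mm(recipe: dict) -> dict:
    """Final concentration (mM) of each nucleotide in the combined mix."""
    total_ul = sum(volume for volume, _ in recipe.values())
    return {name: stock * volume / total_ul
            for name, (volume, stock) in recipe.items() if stock > 0}


if __name__ == "__main__":
    print(mix_concentrations_mm(DNTP_MIX_50X))
```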

RNA degradation is performed as follows. Add 86 ul H2O and 1.5 ul 1 M NaOH/
2 mM EDTA and incubate at 65°C for 10 min. For U-Con 30, add 500 ul TE/sample, spin at 7000 g
for 10 min, and save the flow through for purification. For Qiagen purification, suspend the u-con
recovered material in 500 ul buffer PB and proceed using the Qiagen protocol. For DNAse
digestion, add 1 ul of a 1/100 dilution of DNAse per 30 ul Rx and incubate at 37°C for 15 min.
Incubate for 5 min at 95°C to denature the DNAse.

Sample preparation 

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to
21.8 ul final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat at
95°C for 2 min and slow cool at room temp. for 20 min. Put on slide and hybridize overnight at
64°C. Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 ml 20X
SSC + 0.75 ml 10% SDS in 250 ml H2O; 1X SSC: 5 min., 12.5 ml 20X SSC in 250 ml H2O;
0.2X SSC: 5 min., 2.5 ml 20X SSC in 250 ml H2O. Dry slides and scan at appropriate PMT's
and channels.
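
The wash buffer recipes can be checked with a simple dilution calculation. The sketch below assumes that 250 ml is the final volume of each wash solution, which is the reading under which the stated strengths come out exactly; the function name is illustrative.

```python
def final_strength(stock_strength: float, stock_ml: float,
                   final_ml: float = 250.0) -> float:
    """Strength after diluting `stock_ml` of stock to `final_ml` total volume.

    Works for any unit of strength (X for SSC, percent for SDS).
    Assumption: 250 ml is the final volume, not the volume of water added.
    """
    return stock_strength * stock_ml / final_ml


if __name__ == "__main__":
    print(final_strength(20.0, 37.5))  # first wash, SSC strength in X
    print(final_strength(10.0, 0.75))  # first wash, SDS in percent
    print(final_strength(20.0, 12.5))  # second wash, SSC strength in X
    print(final_strength(20.0, 2.5))   # third wash, SSC strength in X
```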

Example 2: Taxol resistant Xenograft Model of Human Prostate Cancer 

Treatment regimens that include paclitaxel (Taxol; Bristol-Myers Squibb 
Company, Princeton, NJ) have been particularly successful in treating hormone-refractory
prostate cancer in the phase II setting (Smith et al., Semin. Oncol. 26(1 Suppl 2):109-11
(1999)). However, many patients develop tumors which are initially, or later become, 
resistant to taxol. To identify genes that may be involved with resistance to taxol, or are 
regulated in response to taxol resistance, and therefore may be used to treat, or identify, taxol 
resistance in patients, the following experiments were carried out.

The androgen-independent human cell line CWR22R was grown as a
xenograft in nude mice (Nagabhushan et al., Cancer Res. 56(13):3042-3046 (1996); Agus et
al., J. Natl. Cancer Inst. 91(21):1869-1876 (1999); Bubendorf et al., J. Natl. Cancer Inst.
91(20):1758-1764 (1999)). Initially, these xenograft tumors were sensitive to therapeutic
doses of taxol. The mice were treated continuously with sub-therapeutic doses, and the
tumors were allowed to grow for 3-4 weeks, before surgical removal of the tumors. The
tumor from an individual mouse was then minced, and a small portion was then injected into
a healthy nude mouse, establishing a second passage of the tumor. This mouse was then
treated continuously with the same sub-therapeutic dose of taxol. This process was repeated
14 times, and a portion of each generation of xenograft tumor was collected. There was
increasing resistance to therapeutic doses of taxol with each generation. By the end of the
process, the tumors were fully resistant to therapeutic doses of taxol. RNA from each
generation of tumor was then isolated, and individual mRNA species were quantified using a
custom Affymetrix GeneChip® oligonucleotide microarray, with probes to interrogate
approximately 35,000 unique mRNA transcripts. Genes were selected that showed a
statistically significant up-regulation, or down-regulation, during the subsequent generations
of increasingly taxol-resistant tumors. Only one gene was significantly up-regulated, whereas
24 genes were down-regulated; these are presented in Table 10.
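
The text does not state which statistical test was used to select genes. Purely as an illustration of how a monotonic trend across tumor generations might be scored, the sketch below computes a Spearman rank correlation between passage number and expression level. The choice of Spearman correlation, the 0.8 cutoff, and all names are assumptions made here and are not the method of the example.

```python
def _ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation; returns 0.0 when either input is constant."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def classify_trend(expression_by_passage, threshold=0.8):
    """Label a gene 'up', 'down' or 'none' from its trend across passages.

    The threshold is a hypothetical cutoff, not a significance test.
    """
    passages = list(range(1, len(expression_by_passage) + 1))
    rho = spearman_rho(passages, expression_by_passage)
    if rho >= threshold:
        return "up"
    if rho <= -threshold:
        return "down"
    return "none"
```

A real analysis would also need a significance estimate and a correction for testing roughly 35,000 transcripts at once, neither of which is attempted here.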

The gene sequences identified to be overexpressed in prostate cancer may be
used to identify coding regions from the public DNA database. The sequences may be used
to either identify genes that encode known proteins, or they may be used to predict the coding
regions from genomic DNA using exon prediction algorithms, such as FGENESH (Salamov
and Solovyev, 2000, Genome Res. 10:516-522). In addition, one of ordinary skill in the art
would understand how to obtain the unigene cluster identification and sequence information
according to the exemplar accession numbers provided in Tables 1-16 (see,
http://www.ncbi.nlm.nih.gov/UniGene/).

TABLE 1: shows genes, including expression sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu01
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of tumor to normal body tissue

Pkey  UnigeneID  ExAccn  Unigene Title  R1
104674  Hs.26289  AA009527  ESTs  6.4
100727  Hs.3347B6  X07290  Human HF.12 gene mRNA  65
130150  Hs.15113  AF000573  homogentisate 1;2-dioxygenase (homogenti  65
121770  Hs.27B428  AA421714  Homo sapiens mRNA for KIAA0896 protein;  65
123475  Hs.25Q528  AA5992S7  ESTs; Weakly similar to ANKYRIN; BRAIN V  65
133061  Hs.266638  AB000584  prostate differentiation factor  65
116429  Hs.279923  AA6Q9710  ESTs; Weakly similar to similar to GTP-o  62
101233  Hs578  L29008  sorbitol dehydrogenase  62
104691  Hs.37744  AA01117B  ESTs  62
127246  AA325Q29  EST27953 Cerebellum II Homo sapiens cDNA  62
127775  Hs.179902  H04106  ESTs; Weakly similar to (defline not ava  62
105500  Hs.222399  AA256485  ESTs  6.1
131463  Mc271d  X74142  forkhead (Drosophila)-like 1  6.1
132116  Hs.40269  AA234767  ESTs  6
130828  Hs.203213  AA053400  ESTs  55
115357  Hs.72988  AA281783  ESTs  55
105496  Hs501997  AA256323  ESTs  5.7
116334  Hs.48948  AA491457  ESTs  5.7
107968  Hs.61539  AA034020  ESTs  5.7
120132  Hs.125019  Z38839  ESTs; Weakly similar to !!!! ALU SUBFAMI  55
106375  Hs289072  AA443993  ESTs  55
132550  Hs.170185  AA029597  bone morphogenetic protein 7 (osteogenic  55
124777  Hs.140237  R41933  ESTs; Weakly similar to neuronal thread  55
100311  Hs.337616  D50540  phosphodiesterase 3B; cGMP-inhibited  55
101791  Hs.62354  M83822  Human beige-like protein (BGL) mRNA; par  55
117698  Hs.45107  N41002  ESTs  55
132387  Hs.281434  R70914  heat shock 70kD protein 1  55
122041  Hs.98732  AA431407  Homo sapiens Chromosome 16 BAC clone CIT  55
133723  Hs.262476  AA088851  S-adenosylmethionine decarboxylase 1  55
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113938 W81598 ESTs SA 

133015 Hs.246315 AA047036 ESTs 5.4

125745 Hs.75722 AI283493 ribophorin II 5.4

107295 Hs.30120 T34527 UDP-N-acetyl-alpha-D-galactosamine:polyp 5.4

108186 Hs.7780 AA056482 ESTs 5.3

100184 Hs.21223 D17408 calponin 1; basic; smooth muscle 5.3

104486 Hs.326392 N25110 Human guanine nucleotide exchange factor 5.3

104033 Hs.98944 AA355031 ESTs 5.3

110844 Hs.167531 N31952 ESTs; Weakly similar to (defline not ava 5.3

129056 Hs.108338 H70627 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.3

102805 Hs.25351 U90304 iroquois-class homeodomain protein 5.3

133493 Hs.194369 AA284143 Homo sapiens chromosome 1 atrophin-1 rel 5.3

129184 Hs.109201 W26769 ESTs; Highly similar to (defline not ava 5.2

134158 Hs.79428 U15174 BCL2/adenovirus E1B 19kD-interacting pro 5.2

107240 Hs.159372 D59368 ESTs 5.2

104787 AA027317 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.2

123527 Hs.108327 AA608679 damage-specific DNA binding protein 1 (1 5.2

116646 Hs.194228 F03048 ESTs; Moderately similar to !!!! ALU SUB 5.2

101448 Hs.195850 M21389 keratin 5 (epidermolysis bullosa simplex 5.1

116188 Hs.184598 AA464726 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1

126259 Hs.281428 Z21472 ESTs; Moderately similar to !!!! ALU SUB 5.1

105921 Hs.169119 AA402613 ESTs 5.1

103375 Hs.54416 X91668 sine oculis homeobox (Drosophila) homolo 5.1

128871 Hs.106778 AA400271 ESTs; Highly similar to (defline not ava 5.1

112681 Hs.148932 R87331 ESTs; Moderately similar to semaphorin V 5.1

105784 Hs.226434 AA350771 ESTs 5.1

116238 Hs.47144 AA479362 ESTs 5

102913 Hs.30342 X07696 keratin 15 5

103011 Hs.326035 X52541 early growth response 1 5

126023 H58881 yr36d09.r1 Soares fetal liver spleen 1NF 5

103709 Hs.13804 AA037316 ESTs 5

118981 Hs.39288 N93839 ESTs; Weakly similar to !!!! ALU SUBFAMI 5

134807 Hs.89732 X78932 zinc finger protein 273 5

100079 Hs.23311 AB002365 Human mRNA for KIAA0367 gene; partial cd 4.9

132047 Hs.3796 D83492 EphB6 4.8

132880 Hs.177537 AA444369 ESTs 4.8

124049 Hs.74519 F10523 primase; polypeptide 2A (58kD) 4.8

133330 Hs.71119 U42360 Human N33 mRNA; complete cds 4.8

104776 AA026349 ESTs 4.8

122593 Hs.128749 AA453310 Homo sapiens alpha-methylacyl-CoA racema 4.8

103912 Hs.143087 AA251078 ESTs 4.8

113981 Hs.26009 W86307 Homo sapiens mRNA for KIAA0860 protein; 4.8

105288 Hs.3585 AA233168 ESTs; Weakly similar to coded for by C. 4.8

135035 Hs.284186 H89575 ESTs 4.8

104144 Hs.183390 AA447439 ESTs; Weakly similar to ZINC FINGER PROT 4.8

129389 Hs.288126 AA621604 ESTs 4.8

125982 R98091 RAE1 (RNA export 1; S.pombe) homolog 4.8

125162 Hs.26243 W44682 ESTs 4.8

103023 Hs.117850 X53793 multifunctional polypeptide similar to S 4.7

129735 W80701 ESTs; Weakly similar to HERV-E envelope 4.7

104479 Hs.106390 N36040 ESTs 4.7

103731 AA070545 zm7c3j1 Stratagene neuroepithelium (#93 4.7

126575 Hs.127602 W72416 ESTs 4.7

124578 Hs.231500 N68321 Human glucose transporter-like protein-I 4.7

130617 Hs.1674 M90516 glutamine-fructose-6-phosphate transamin 4.7

116752 Hs.91622 H06373 Homo sapiens clone 24456 mRNA sequence 4.7

100279 Hs.82007 D42084 Human mRNA for KIAA0094 gene; partial cd 4.7

126288 Hs.89576 AI479264 ESTs 4.7

131836 Hs.32990 AA610086 ESTs 4.7

106717 Hs.239489 AA465093 TIA1 cytotoxic granule-associated RNA-bi 4.7

114542 Hs.31011 AA055768 ESTs 43

103806 AA130614 zd1£j1 Stratagene neuroepithelium NT2R 43

130529 AA173238 small inducible cytokine A5 (RANTES) 43

115675 Hs.32065 AA406546 ESTs 43

111386 Hs.293798 N95326 ESTs 43

106503 Hs.29679 AA452411 ESTs 43

119943 Hs.14158 W86835 copine III 43

104459 Hs.100070 M91493 EST 43

100774 Hs.69603 HG371-HT1063 Mucin 1, Epithelial, Alt. Splice 6 43

100652 Hs.142653 HG2825-HT2949 Ret Transforming Gene 4.6

132015 Hs.3731 D11800 ESTs 4.6

126086 H70975 yi73g01ii Soares fetal liver spleen 1NF 4.6

130888 Hs.173094 F03819 ESTs 4.6

106390 Hs.20166 AA446964 Prostate stem cell antigen 4.6

126959 AA199853 ESTs; Moderately similar to !!!! ALU SUB 4.5

131584 Hs.29117 X91648 H.sapiens mRNA for pur alpha extended 3 4.5

104638 Hs.20953 AA039481 ESTs 4.5

125661 R50319 ESTs 4.5

103171 Hs.234728 X68733 alpha-1-antichymotrypsin 4.5

103928 Hs.199160 AA280085 ESTs 4.5

102899 Hs.75730 X06272 signal recognition particle receptor ('d 4.5

100892 Hs.180789 HG4557-HT4962 Small Nuclear Ribonucleoprotein U1, 1snr 4.5

106167 Hs.7956 AA425906 ESTs 4.5

129404 Hs.317584 AA172056 ESTs 4.5

106990 Hs.24758 AA521354 ESTs 4.5

132316 Hs.44566 U28831 Human protein immuno-reactive with anti- 4.4

132056 Hs.38176 T89386 Homo sapiens mRNA for KIAA0606 protein; 4.4

133718 Hs.193760 X15306 neurofilament; heavy polypeptide (200kD) 4.4

101470 Hs.1846 M22898 tumor protein p53 (Li-Fraumeni syndrome) 4.4

131904 Hs.284296 AA143019 ESTs; Highly similar to surface 4 integr 4.4

105804 Hs.22514 AA383142 ESTs 4.4

122861 Hs.119394 AA464428 ESTs 4.4

111336 Hs.29894 N79565 ESTs 4.4

121944 Hs.98518 AA429278 ESTs 4.4

134401 Hs.211577 AA243746 ESTs; Highly similar to CGI protein [H.s 4.4

126458 Hs.288969 AA815252 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.4

133435 Hs.323968 T23983 ESTs; Moderately similar to !!!! ALU SUB 4.4

105178 Hs.21941 AA187490 ESTs 4.3

127315 AA640834 nr27b08 jI NCI_CGAP_Pr3 Homo sapiens cDN 4.3

132645 Hs.54424 X87870 H.sapiens mRNA for hepatocyte nuclear fa 4.3

116162 Hs.282990 AA461487 ESTs; Weakly similar to F52C12^ [C.eleg 4.3

118040 Hs.47567 N52876 EST 4.3

130008 Hs.278427 M31423 cerebellar degeneration-related protein 4.3

126607 Hs.114668 W87424 ESTs 4.3

123061 Hs.105130 AA482030 EST 4.3

109391 Hs.184245 AA219699 ESTs 4.3

109175 AA180496 ESTs 4.3

127003 Hs.173540 AA550806 ESTs; Weakly similar to (defline not ava 4.3

102547 Hs.46638 U57911 chromosome 11 open reading frame 8 4.3

134208 Hs.79993 U88871 peroxisomal biogenesis factor 7 4.3

104258 Hs.5462 AF007216 solute carrier family 4; sodium bicarbon 4.3

130759 Hs.18946 AA094720 ESTs; Weakly similar to (defline not ava 4.3

132160 Hs.295923 AA281770 seven in absentia (Drosophila) homolog 1 4.3

135062 Hs.93872 AA174183 ESTs 4.3

126510 Hs.334762 R49702 ESTs; Weakly similar to KIAA0319 [H.sapi 4.2

122055 Hs.98747 AA431732 EST 4.2

133138 Hs.6574 AF007165 suppressin (nuclear deformed epidermal a 4.2

109890 Hs.20843 H04649 ESTs 4.2

133294 Hs.69997 R79723 H.sapiens mRNA for translin associated z 4.2

134436 Hs.83190 S80437 fatty acid synthase {3' region} [human, 4.2

107375 Hs.251064 U88573 NBR2 4.2

122323 Hs.27413 AA436158 ESTs 4.2

103044 Hs.248210 X55777 H.sapiens Mahlavu hepatocellular carcino 4.2

120125 Hs.59815 W99362 EST 4.2

128969 Hs.283978 T65327 ESTs; Highly similar to (defline not ava 4.2

129637 Hs.1179 D90359 TATA box binding protein (TBP)-associate 4.2

106566 AA455921 ESTs; Weakly similar to !!!! ALU SUBFAMI 4.2

112605 HS298S2 R79220 ESTs 4.2

103364 Hs.279929 X90872 H.sapiens mRNA for gp25L2 protein 4.2

132811 Hs.57419 U25435 transcriptional repressor 4.2

126570 Hs.328292 T79274 ESTs 4.2

116298 Hs.94109 AA489046 ESTs 4.2

103024 Hs.105938 X53961 lactotransferrin 4.1

129133 Hs.108850 R56728 yg95c6 j1 Soares infant brain 1NIB Homo 4.1

133167 HsX641 N98707 kinesin family member 5C 4.1

126871 Hs.14051 AA351779 ESTs 4.1

132333 Hs.45032 AA192157 ESTs 4.1

107376 Hs.327179 U90545 solute carrier family 17 (sodium phospha 4.1

17280
00234 
00959 
07130 
05035 
26735 
13056 
02460 



123107 
127256 



15504 
20726 
03576 
27889 
06394 
28046 
03391 
06448 
26513 



Hs.100861

Hs.116774

Hs.24183

Hs.26369

Hs.181889

Hs.172129

Hs.3085

Hs.118127

Hs.12913



10151 
05344 
W791 
23442 
27800 
14555 
22138 
29565 
03471 
33908 
05635 
134285 
34125 



00842 
04334 
10242 
125298 
04060 
105823 
26499 
30752 
23494 
04846 



15506 
100452 
04454 
108730 
31223 
04784 
04946 



01724 
106140 
128135 
120030
126457
23917 
10714 
30577



Hs.226795

Hs.8036

Hs.211562

Hs.26813

Hs.104207

Hs.267967

Hs.22862

Hs.42736

Hs.97293

Hs.94560

Hs.144941

Hs.25320

Hs.114368

Hs.27004

HSX6276

Hs.38314

Hs.31608

Hs.8645

Hs.301871

Hs.111498

Hs.79428

Hs.167904

Hs.163960

Hs.198726

Hs.75216

Hs.325474

Hs.301985

Hs.81086

Hs.50421

Hs.241493

Hs.186600

Hs.182183

Hs.78771

Hs.19978

Hs.289008

Hs.303193

Hs.293960

Hs.110445

Hs.18895

Hs.112110

Hs.32478

Hs.71721

Hs.45207

Hs.241552

Hs.129228

Hs.102859

Hs.24427

Hs.269228

Hs.73848

Hs.9394

Hs.620

Hs.14912

Hs.269721

H&S8694 



AA280617 

AA450324 

AA343514 

AA133237 

H29730 

N22107 

D29677 

J00073 

AA620582 

M128486 

AA808949 

T26471 



AA504631 
AA486071 
AA327550 
AA234561 
AA291848 



Z26317 

AI147408 

AA447223 

AAS73285 

X94453 

AA449455 

W27601 

AA487015 

H18836 

AA235303 

AA029046 

AA598803 

AA521047 

AA058594 

AA435549

X77777 

Y00815 

M83218 

AA281508 

AA460012 

R38102 

AA418069 

AA018758 

HG2743-HT3926 

D82614 

H26417 



ESTs; Weakly similar to p60 katanin [H.s

ESTs 

ESTs 

ESTs 

ESTs 

ESTs; Moderately similar to !!!! ALU SUB

KIAA0054 gene product

actin; alpha; cardiac muscle

ESTs; Weakly similar to (defline not ava

ESTs

glutathione S-transferase pi

ESTs; Moderately similar to !!!! ALU SUB

Homo sapiens myosin light chain kinase (

ESTs; Weakly similar to (defline not ava

ESTs 

ESTs; Weakly similar to !!!! ALU SUBFAMI

ESTs 

ESTs 

ESTs 

desmoglein 2
ESTs 
ESTs 
ESTs 

pyrroline-5-carboxylate synthetase (glut
ESTs

ESTs; Moderately similar to (defline not

ESTs; Weakly similar to !!!! ALU SUBFAMI

ESTs 

ESTs 

ESTs 

ESTs 

BCL2/adenovirus E1B 19kD-interacting pro

ESTs 

ESTs



Hs.112869

Hs.17752 

Hs.162 



AA397968 

AA398197 

AA315671 

D50927 

AA599786 

AA040154 

AA142913 

AA292537 

D87742 

M84443 

AA126254 

AA247788 

AA027055 

AA069549 

AA495926 

M69225 

AA424524 

AA913491 

W92051 

AA007489 

AA621311 



M35410 



protein tyrosine phosphatase; receptor t 

caldesmon 1

ESTs

solute carrier family 22 (organic cation

KIAA0203 gene product

natural killer-tumor recognition sequenc

ESTs

Caldesmon 1, Alt. Splice 6, Non-Muscle

ESTs 

ESTs 

ESTs 

zt87a9/1 Soares_testis_NHT Homo sapiens
ESTs

ESTs; Moderately similar to unknown [M.m

KIAA0137 gene product

ESTs 

ESTs 

ESTs 

ESTs 

Human mRNA for KIAA0268 gene; partial cd

galactokinase 2

ESTs

ESTs; Highly similar to (defline not ava

ESTs 

ESTs 

ESTs 

bullous pemphigoid antigen 1 (230/240kD) 
Homo sapiens mRNA for KIAA0286 gene; par
ESTs
ESTs

zh98g04.r1 Soares_fetal_liver_spleen_1NF
EST

Homo sapiens phosphatidylserine-specific
insulin-like growth factor binding prote




4.1 
4.1 
4.1 
4.1 
4.1 
4.1 
4.1 
4.1 
4.1 
4.1 
LI 



3.9 
3.9 

3.9

3.9 
3.9 
3.9 
3.9
3.9 
3.9 

3.9

3.9 
3.9 
3.9 

3.9
3.9
3.9
3.9

3.9 
3.9 
3.9 
33 
33 
35 



33 
33 
3.8 

3.8

3.8
3.8 
3.8 
3.8 
3.8
3.8 
33 
33 
33 
33 
33 

3.7
3.7

117667 Hs.44708 N39214 ser-Thr protein kinase related to the my 3.7

126104 Hs.39712 N77278 ESTs; Weakly similar to BONE/CARTILAGE P 3.7

100379 Hsl27S721 D82060 Homo sapiens mRNA for membrane protein w 3.7

115646 Hs.305971 AA404352 ESTs 3.7

125792 Hs.193700 AI005388 ESTs; Moderately similar to !!!! ALU SUB 3.7

102162 Hs.1592 U18291 CDC16 (cell division cycle 16; S. cerevi 3.7

126530 Hs.183475 AA504343 ESTs; Moderately similar to !!!! ALU SUB 3.7

119940 Hs.272531 W86779 EST 3.7

110769 Hs.23837 N22222 yw34b08.s1 Morton Fetal Cochlea Homo sap 3.7

132914 Hs.60293 AA496037 ESTs 3.7

113594 Hs.15683 T92030 ESTs 3.7

103702 Hs.279952 AA027793 ESTs; Highly similar to (defline not ava 3.7

130780 Hs.19347 AA248406 ESTs 3.7

123288 Hs.291025 AA495836 EST 3.7

120691 Hs.22380 AA291173 ESTs 3.7

103153 Hs.75295 X66534 guanylate cyclase 1; soluble; alpha 3 3.7

129201 Hs.109390 H19969 ESTs 3.7

114798 Hs.54900 AA159181 ESTs 3.7

126801 Hs.7337 AA512902 ESTs 3.7

105503 Hs.31707 AA256616 ESTs 3.7

104260 Hs.194283 AF008192 Homo sapiens putative GR6 protein (GR6) 3.7

125980 Hs.35699 R97219 ESTs 3.7


123255 Hs.105273 AA490390 ESTs 3.6

103862 Hs.6363 AA206625 ESTs 3.6

100696 Hs.121686 HG3162-HT3339 Transcription Factor Iia 3.6

134917 Hs.166994 X87241 FAT tumor suppressor (Drosophila) homolo 3.6

103520 Y10511 H.sapiens mRNA for C0176 protein 3.6

113778 Hs.302738 W15263 ESTs 3.6

101838 Hs.75511 M92934 connective tissue growth factor 3.6

113702 T97307 ESTs; Moderately similar to !!!! ALU SUB 3.6

118201 Hs.48428 N59800 EST 3.6

116519 Hs.68554 C20780 EST 3.6

105866 Hs.22983 AA400517 ESTs; Moderately similar to UDP-GLUCOSE: 3.6

106709 Hs.170291 AA464696 ESTs 3.6

127858 Hs.27973 AA806365 oc26h07j1 NCI_CGAP_GCB1 Homo sapiens cD 3.6

101964 S81578 dioxin-responsive gene {putative polyade 3.6

105508 Hs.326416 AA256680 ESTs 3.6

116844 Hs.337434 H64938 ESTs 3.6

105372 Hs.142296 AA238481 ESTs 3.6

100745 Hs.144630 HG3510-HT3704 V-Erba Related Ear-3 Protein 3.6

127521 Hs.164018 AA809982 ESTs 3.6

110758 Hs.274265 N21365 talin 3.6

107307 Hs.44155 T52099 creatine kinase; mitochondrial 2 (sarcom 3.6

133200 Hs.183639 AA432248 ESTs 3.6

114774 Hs.184325 AA150043 ESTs 3.6

120265 Hs.270696 AA173759 ESTs; Moderately similar to !!!! ALU SUB 3.6

134359 Hs.199067 M34309 v-erb-b2 avian erythroblastic leukemia v 3.6

116250 Hs.44829 AA480975 ESTs; Moderately similar to !!!! ALU SUB 3.6

106313 Hs.35841 AA436459 nuclear factor I/X (CCAAT-binding transc 3.6

131898 Hs.279780 N52232 ESTs 3.6

133444 Hs.73793 M27281 vascular endothelial growth factor 3.6

128232 Hs.334641 H06296 ESTs 3.6


135357 Hs.79572 AA235803 ESTs 3.5

457951 AI369384 arylsulfatase D 3.5

108407 AA075519 zm87h9.s1 Stratagene ovarian cancer (#93 3.5

126659 T16245 a disintegrin and metalloproteinase doma 3.5

104189 Hs.301804 AA485805 ESTs 3.5

125956 Hs.129014 N53276 ESTs 3.5

103026 Hs.78386 X54162 Human mRNA for a 64 Kd autoantigen expre 3.5

133011 Hs.171921 AA042990 sema domain; immunoglobulin domain (Ig); 3.5

131379 Hs.26176 R49035 ESTs 3.5

126742 Hs.169359 H64106 yr57e06.r1 Soares fetal liver spleen 1NF 3.5

105560 Hs.306915 AA262783 ESTs 3.5

118472 Hs.42179 N66818 ESTs 3.5

105623 Hs.30127 AA280895 ESTs; Highly similar to !!!! ALU SUBFAMI 3.5

120262 Hs.145807 AA172076 ESTs; Moderately similar to !!!! ALU SUB 3.5

105027 Hs.26771 AA126472 ESTs 3.5

130760 Hs.18953 AA126997 phosphodiesterase 9A 3.5

117473 Hs.155560 N30157 ESTs 3.5

127862 Hs.163191
126995 Hs.189810
119071

103941 Hs.96593
110721 Hs.31319
126586 Hs.43086
103106 Hs.1857
116357 Hs.30797
105309 Hs.4104
130796 Hs.19525
109101 Hs.52184
103134 Hs.2839
131798 Hs.301449
118535 Hs.49418
102592 Hs.11223
125905 Hs.6456
109160 Hs.301997
105327 Hs.211593
106586 Hs.57787
122635

132413 Hs.260116
131938 Hs.34956
133871 Hs.182793
107175 Hs.282503
101188 Hs.184298
126422 Hs.237658
118475

104558 Hs.88959
128307 Hs.132005
112254 Hs.25829
125408 Hs.39578
109834 Hs.175955
130844 Hs.20191
127143 Hs.30843
135309 Hs.42500
125724 Hs.235978
127692 Hs.187983
116674 Hs.92127
134700 Hs.8868
114846 Hs.166196
103649 Hs.155983
134835 Hs.89925
130568 Hs.16085
111331 Hs.15978
106036 Hs.10653
130987 Hs.21893
112814 Hs.35828
127815 Hs.255015
100144 Hs.75616
101129 Hs.247992
130874 Hs.20621
106882 Hs.26994
103855 Hs.302287
125957

114048 Hs.146085
109826 Hs.75354
125355 Hs.170098
104182 Hs.143792
100294 Hs.75454
131688 Hs.30692
116256 Hs.8201
102034 Hs.230
130072 Hs.14658
114615 Hs.159456
128707 Hs.104105



AA765305 
W26950 
R31180 
AA282978 
H97678 
AA011247 
X62025
AA504806 
AA233790 



AA167708 

X65724 

X86098 

N67968 

U62389 



AA179387 

AA234440 

AA456593 

AA454085 

AA132969 

AA283620 

AA454597 

AA621751 

120320 

H48518 

N66845 

R56678 

AI453794 

R51831 

N72353 

K00604 

D12122 

AA533553 



AA083407 

AI021912 

F04816 

AA481414 

AA234929 

Z70219 

L04569 

AA232535 

N78773 

AA412505 

R45698 

R58192

AA676009 

D13643 

L10405 

T05287

AA489009 

AA195179 

H45213 

W94613 

F13702

R45630 

AA479990 

D49396 

U24153 

AA481256 

U05291 

R99606 

AA083812 

AA138474 



karyopherin (importin) beta 2

ESTs; Weakly similar to (defline not ava

ESTs

transcription factor-like 5 (basic helix
EST 

Human DNA sequence from PAC 388M5 on chr 

ESTs 

ESTs 

ESTs 

ESTs 

phosphodiesterase 6G; cGMP-specific; rod

Homo sapiens clone 23620 mRNA sequence

ESTs 

ESTs 

ESTs 

Norrie disease (pseudoglioma)
adenovirus 5 E1A binding protein
ESTs

Human putative cytosolic NADP-dependent

chaperonin containing TCP1; subunit 2 (b

ESTs 

ESTs 

ESTs 

EST 

metalloprotease 1 (pitrilysin family)

ESTs 

ESTs 

ESTs; Weakly similar to KIAA0601 protein 
cyclin-dependent kinase 7 (homolog of Xe
ESTs; Highly similar to apolipoprotein A
ESTs; Weakly similar to !!!! ALU CLASS B
ESTs; Weakly similar to !!!! ALU SUBFAMI
ESTs 
ESTs 

yv37e12.s1 Soares fetal liver spleen 1NF
ESTs

seven in absentia (Drosophila) homolog 2
nj68h04.s1 NCI_CGAP_Pr10 Homo sapiens cD
ESTs

stimulated trans-acting factor (50 kDa)

ESTs 

ESTs 

golgi SNAP receptor complex member 1 
ESTs 

H.sapiens mRNA for SUTR for unknown pro

calcium channel; voltage-dependent; L ty

ESTs; Highly similar to (defline not ava

ESTs 

ESTs 

ESTs 

ESTs 

ob93c10^1 NCI_CGAP_GCB1 Homo sapiens cD

KIAA0018 gene product 

Homo sapiens DNA binding protein for sur 

ESTs 

ESTs 

ESTs 

yo03b08 Jl Soares adult brain N2b5HB55Y 

ESTs 

ESTs 

ESTs; Highly similar to KIAA0372 [H.sapi
ESTs; Weakly similar to glioma amplified
Human mRNA for Aop1_Human (MER5(Aop1-Mou
p21 (CDKN1A)-activated kinase 2
ESTs; Weakly similar to (defline not ava



Human chromosome 5q13J clone 5GB mRNA
ESTs; Highly similar to (defline not ava
Meis (mouse) homolog 2




15
33
35
33
33
35
3.4
3.4

115046 Hs.190057 AA252668 ESTs 3.3

125862 Hs.31110 H12034 ESTs 3.3

135142 Hs.24192 R31679 ESTs 3.3

103119 Hs.2877 X63629 cadherin 3; P-cadherin (placental) 3.3

104460 Hs.62604 M91504 ESTs 3.3

100365 Hs.79284 D78611 mesoderm specific transcript (mouse) hom 3.3

131524 Hs.301604 N39152 ESTs 3.3

102165 Hs.159627 U18321 Death associated protein 3 3.3

126966 Hs.182575 R38438 solute carrier family 15 (H+/peptide tra 3.3

124839 Hs.140942 R55784 ESTs 3.3

100709 Hs.100469 HG3264-HT3441 Af-6 (Gb:U02478) 3.3

132967 Hs.61635 AA032221 Homo sapiens BAC clone RG041D11 from 7q2 3.3

102927 Hs.55114 X12876 keratin 18 3.3

132616 Hs.283558 AA386264 ESTs 3.3

125132 Hs.129781 W15495 ESTs 3.3

111225 Hs3t652 N68989 ESTs 3.3

114956 Hs.7113 AA243681 ESTs 3.3

122235 Hs.112227 AA436475 ESTs 3.3

112325 Hs.12315 R56055 ESTs 3.3

123360 Hs.178604 AA504784 ESTs 3.3

105150 Hs.155995 AA169640 Homo sapiens mRNA for KIAA0643 protein; 3.3

107391 Hs.284294 W02877 ESTs 3.3

113058 Hs.7569 T26893 EST 3.3

134371 Hs.22318 S69790 Brush-1 3.3

125669 Hs.333256 R51308 ESTs; Moderately similar to !!!! ALU SUB 3.3

111506 Hs.294105 R07726 ESTs 3.3

122974 Hs.194215 AA478625 ESTs 3.3

102369 Hs.2998B7 U33840 hepatocyte nuclear factor 3; alpha 3.3

120408 Hs.190151 AA235045 ESTs 3.3

117993 Hs.47402 N52039 ESTs; Weakly similar to !!!! ALU SUBFAMI 3.3

128586 Hs.11500 AA437118 ESTs 3.3

128138 Hs.126494 AI200825 ESTs 3.3

127265 AA332751 EST37214 Embryo, 8 week I Homo sapiens c 3.3


107674 Hs.41143 AA011027 Homo sapiens mRNA for KIAA0581 protein; 3.2

104866 Hs.293691 AA045342 ESTs 3.2

103427 Hs.250655 X97303 H.sapiens mRNA for Ptg-12 protein 3.2

132990 Hs.334334 AA458761 ESTs 3.2

127017 Hs.251946 AA740146 ESTs 3.2

132313 Hs.44481 U13220 forkhead (Drosophila)-like 6 3.2

106880 Hs.32425 AA488889 ESTs 3.2

107039 Hs.169780 AA599751 homologous to yeast nitrogen permease (c 3.2

120870 H&2&2581 AA357172 ESTs 3.2

107920 Hs.284207 AA027951 ESTs 3.2

104165 Hs.105116 AA459160 EST 3.2

107012 Hs.63908 AA598745 ESTs 3.2

103605 Hs.194657 Z35402 H.sapiens gene encoding E-cadherin, exon 3.2

124006 Hs.270018 D60302 ESTs 3.2

101300 Hs.74137 L40391 Homo sapiens (clone 8153) mRNA fragment 3.2

101183 Hs.795 L19779 H2A histone family; member O 3.2

125596 R25698 yg44h11j2 Soares infant brain 1NIB Homo 3.2

127261 AA661567 nu86b02.s1 NOjCGAP JW Homo sapiens cD 3.2

120090 Hs.59554 W94591 ESTs 3.2

129393 Hs.166982 D13435 phosphatidylinositol glycan; class F 3.2

120923 Hs.97129 AA382283 ESTs 3.2

118907 Hs.274256 N91003 ESTs 3.2

111552 Hs.191185 R09411 ESTs 3.2

104431 Hs.99913 J03019 adrenergic; beta-1-; receptor 3.2

133551 Hs.278634 D63480 Human mRNA for KIAA0146 gene; partial cd 3.2

131615 Hs.192603 D14533 xeroderma pigmentosum; complementation g 3.2

126547 Hs.84072 U47732 transmembrane 4 superfamily member 3 3.2

103172 Hs.116774 X68742 integrin; alpha 1 3.2

113867 Hs.24095 W68845 ESTs 3.2

133323 Hs.70937 Z83735 H3 histone family; member K 3.2

111597 Hs.189716 R11499 ESTs 3.2

121515 Hs.104696 AA412133 ESTs 3.2

107445 Hs.6639 W28406 ESTs 3.2

106887 Hs.334335 AA489091 ESTs 3.2

123052 Hs.185766 AA481806 ESTs 3.2

107072 Hs.130760 AA609113 Homo sapiens mRNA; cDNA DKFZp586N0318 (f 3.2

102214 Hs.32864 U23752 SRY (sex-determining region Y)-box 11 3.2

123147 AA487961 ab11h6.s1 Stratagene lung (#93721) Homo 3.2

125435 Hs.272138 R00940 ye87g03.r1 Soares fetal liver spleen 1NF 3.2

116246 Hs.250646 AA479961 ESTs; Highly similar to ubiquitin-conjug 3.2

105169 Hs.180789 AA180321 Homo sapiens (clone S164) mRNA; 3' end o 3.2

134001 Hs.78344 AF001548 myosin; heavy polypeptide 11; smooth mus 3.2

124866 Hs.304389 R88571 ESTs 3.2

133205 Hs.67619 AA089559 Homo sapiens mRNA; chromosome 1 specific 3.2

102986 Hs.182378 X17648 colony stimulating factor 1 (macrophage) 3.2

101232 Hs.242894 L28997 ADP-ribosylation factor-like 1 3.1

132906 Hs.234898 AA142857 ESTs; Highly similar to geminin [H.sapie 3.1

104281 Hs.2669 C14290 ESTs 3.1

123926 Hs.227833 AA821348 ESTs; Highly similar to (defline not ava 3.1

134464 Hs.239720 N79354 ESTs; Weakly similar to Rga [D.melanogas 3.1

105322 Hs.16346 AA234100 ESTs 3.1

100631 Hs.48332 HG2709-HT2805 Serine/Threonine Kinase (Gb:Z25431) 3.1

130791 Hs.199263 AA259102 ESTs; Highly similar to (defline not ava 3.1

131220 Hs.300855 R77200 ESTs 3.1

113237 Hs.123642 T62857 ESTs 3.1

125562 Hs.98968 AJ494372 ESTs 3.1

134110 Hs.79136 U41060 Human breast cancer; estrogen regulated 3.1

132393 Hs.47334 W85888 ESTs; Moderately similar to !!!! ALU SUB 3.1

107439 Hs.296842 W27995 ESTs; Moderately similar to non-muscle m 3.1

125863 Hs.40719 AA299096 Homo sapiens mRNA; cDNA DKFZp564M0916 (f 3.1

105811 Hs.286192 AA394121 ESTs 3.1

129284 Hs.296141 AA104023 ESTs 3.1

125321 Hs.178294 T86652 ESTs 3.1

107332 Hs.183297 T87750 ESTs 3.1

123570 Hs.109653 AA608955 ESTs 3.1

100384 Hs.90800 D83646 matrix metalloproteinase 16 (membrane-in 3.1

109063 Hs.38972 AA161043 tetraspan 1 3.1

133284 Hs.182828 U09367 zinc finger protein 136 (clone pHZ-20) 3.1

131839 Hs.33010 H80622 Homo sapiens mRNA for KIAA0633 protein; 3.1

117606 Hs.44698 N35115 ESTs 3.1

418998 Hs.287849 F13215 ESTs 3.1

125180 Hs.103120 W58344 ESTs 3.1

100789 HG3893-HT4163 Phosphoglucomutase 1, Alt. Splice 3.1

126017 Hs.159440 H60487 ESTs 3.1

132452 Hs.247324 AA005262 Homo sapiens DNA sequence from PAC 262D1 3.1

129077 Hs.108479 H78886 ESTs 3.1

126563 Hs.181368 W26247 U5 snRNP-specific protein (220 kD); orth 3.1

129650 Hs.118258 N52554 ESTs 3.1

123465 AA599033 ESTs 3.1

126486 Hs.152316 AA345339 EST51345 Gall bladder II Homo sapiens cD 3.1

126460 Hs.167031 W01616 za36d05.r1 Soares fetal liver spleen 1NF 3.1

118697 Hs.43234 N72094 ESTs 3.1

103860 Hs.38057 AA203742 ESTs 3.1

127968 Hs.124347 AA971439 ESTs 3.1

124984 Hs.223241 T47566 yb15c1U1 Stratagene placenta (#937225) 3.1

103903 Hs.15220 AA249334 j312^eq.F Human fetal heart, Lambda ZAP 3.1

106697 Hs.22242 AA463737 ESTs 3.1

130892 Hs.20993 AA442604 ESTs; Weakly similar to Ydr374cp [S.cere 3

114032 Hs.35014 W92779 ESTs 3

128835 Hs.106390 W15528 ESTs 3

103667 Hs.247815 Z80788 H.sapiens H4/I gene 3

126264 Hs.250614 N42897 yy13h06.r1 Soares melanocyte 2NbHM Homo 3

132626 Hs.21275 D25755 ESTs 3

131107 Hs.75354 N87590 ESTs 3

126780 Hs.5811 R12421 ESTs 3

127363 Hs.22116 AA307744 Homo sapiens Cdc14B1 phosphatase mRNA; c 3

103690 Hs.84063 AA016186 ESTs 3

102589 Hs.8867 U62015 Homo sapiens Cyr61 mRNA, complete cds 3

125144 Hs.24336 W37999 ESTs 3

132977 Hs.301404 U28686 RNA binding motif protein 3 3

120714 Hs.146170 AA292689 ESTs 3

101038 Hs.79411 J05249 replication protein A2 (32kD) 3

102856 Hs.248177 X00090 Human histone H3 gene 3

105516 Hs.30738 AA257971 ESTs 3

131137 Hs.33287 U85193 nuclear factor I/B 3

127221 Hs.241551 AI354332 ESTs 3

411888 Hs.24104 R26708 ESTs 3

131684 Hs.3068 U26174 granzyme K (serine protease; granzyme 3; 3

100629 Hs*1291 HG270&-HT2B02 Serine/Threonine Kinase (Gb:Z25428) 3

119344 Hs.58915 W86838 EST 3

113801 Hs.118281 W38418 zinc finger protein 266 3

133780 Hs.76152 M14219 decorin 3

104690 Hs.14449 AA010889 ESTs 3

126371 Hs.304139 N57645 EST 3

127635 Hs.116346 AA766903 ESTs 3

128434 Hs.143880 AI190914 ESTs 3

435761 Hs.187555 AA701941 ESTs 3

125025 Hs.50748 T71561 ESTs 3

124940 Hs.103804 R99599 heterogeneous nuclear ribonucleoprotein 3

128742 Hs.251531 D00763 proteasome (prosome; macropain) subunit; 3

107147 Hs.10450 AA621125 Homo sapiens chromosome 2; 10 repeat reg 3

112068 Hs.22545 R43910 ESTs 3

105346 Hs.263727 AA235465 ESTs; Moderately similar to !!!! ALU SUB 3

130972 Hs.21739 AA370302 Homo sapiens mRNA; cDNA DKFZp586M 51 8 (f 3

131230 Hs.274407 AA149987 thymus specific serine peptidase 3

133743 Hs.75847 N79435 ESTs 3

127402 Hs.227949 AA358869 ESTs; Highly similar to SEC13-RELATED PR 3

117483 Hs.44189 N30426 ESTs 3

123659 Hs.112699 AA609368 ESTs 3

103963 Hs.53290 AA298588 EST114219 HSC172 cells II Homo sapiens c 3

103795 Hs.7367 AA112222 ESTs; Moderately similar to (defline not 3

115092 Hs.30975 AA255903 CD39-like 4 2.9

134831 Hs.59890 S72370 pyruvate carboxylase 2.9

128579 Hs.101810 AA093378 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.9

134193 Hs.7880 F09570 ESTs 2.9

123522 Hs.112575 AA608577 ESTs 2.9

107109 Hs.32793 AA609943 ESTs 2.9

134694 Hs.88556 D50405 histone deacetylase 1 2.9

134399 Hs.52689 H99801 tumor rejection antigen (gp96) 1 2.9

134632 Hs.174139 AA398710 H. sapiens RNA for CLCN3 2.9

106683 Hs.14512 AA461495 ESTs 2.9

108555 AA084963 zn13e1ls1 Stratagene hNT neuron (#93723 2.9

100953 Hs.2110 HG945-HT945 Nucleic Acid-Binding Protein (Gb:L12693) 2.9

130597 Hs.16492 AA173998 ESTs; Weakly similar to weakly similar t 2.9

101813 Hs.139226 M87338 replication factor C (activator 1) 2 (40 2.9

106638 Hs.288 AA459950 ESTs 2.9

129109 Hs.108708 AA491295 calcium/calmodulin-dependent protein kin 2.9

125819 Hs.251871 AA044840 stromal cell-derived factor 1 2.9

106282 Hs.5857 AA433946 ESTs; Weakly similar to (defline not ava 2.9

100388 Hs.301636 D83703 peroxisomal biogenesis factor 6 2.9

114546 Hs.58074 AA056263 ESTs; Moderately similar to !!!! ALU SUB 2.9

105914 Hs.5701 AA402224 Homo sapiens growth arrest and DNA-damag 2.9

108552 AA084912 zn11c7.s1 Stratagene hNT neuron (#937233 2.9

126505 Hs.190057 W26894 16a1 1 Human retina cDNA randomly primed 2.9

134098 Hs.79086 X06323 Human MRL3 mRNA for ribosomal protein L3 2.9

129721 Hs.211539 L19161 eukaryotic translation initiation factor 2.9

100076 Hs.277422 AB000897 Homo sapiens mRNA for cadherin RB3, par 2.9

117466 Hs.44104 N29862 ESTs 2.9

106335 Hs.36688 AA437258 ESTs; Moderately similar to WAP four-dis 2.9

134510 Hs.250870 U25265 protein kinase; mitogen-activated; kinas 2.9

105835 Hs.52995 AA398412 ESTs 2.9

106611 Hs.26267 AA458904 ESTs; Weakly similar to torsinA [H.sapie 2.9

134087 Hs.173824 U51166 thymine-DNA glycosylase 2.9

100641 Hs.182183 HG2743-HT2846 Caldesmon 1, Alt. Splice 4, Non-Muscle 2.9

104602 R36920 ESTs 2.9

117203 Hs.42738 H99799 ESTs 25

131889 Hs.34073 AA401912 BH-protocadherin (brain-heart) 25

101707 Hs.155212 M65131 methylmalonyl Coenzyme A mutase 25

115271 Hs.2724 AA279422 ESTs 25

125812 Hs.287912 H73420 lectin; mannose-binding; 1 25

110740 Hs.19762 H99675 ESTs 25

103406 Hs.285728 X95677 H.sapiens mRNA for ArgBPIB protein 25

104577 Hs.132390 R71539 ESTs 25

102772 Hs.161002 U83115 absent in melanoma 1 25

131710 Hs.30985 AA233225 ESTs; Highly similar to (defline not ava 25

125231 Hs.268903 W84714 ESTs 25

127380 Hs.15535 AI417137 Homo sapiens clone 24582 mRNA sequence 25

104229 Hs.61289 AB002346 inositol phosphate phosphatase 2 (syn 25

126600 Hs.191385 AA699949 ESTs 25

125175 Hs.303030 W52355 EST 25

103849 Hs.34576 AA187045 ESTs; Weakly similar to !!!! ALU SUBFAMI 25

102126 Hs.78961 U14575 protein phosphatase 1; regulatory (inhib 25

124906 Hs.107815 R87647 ESTs 25

131148 H$5Q3125 C00038 ESTs 25

123158 Hs.218329 AA488658 heat shock 70kD protein 1 25

133667 Hs.75462 U72649 Human BTG2 (BTG2) mRNA; complete cds 25

105182 Hs.18271 AA191014 ESTs; Weakly similar to Ydr372cp [S.cere 25

133968 Hs.232068 D15050 Human mRNA for transcription factor AREB 25

117425 Hs.336901 N27154 ESTs 25

111087 Hs.37837 N59545 ESTs 25

129641 Hs.11805 N66066 ESTs 25

128639 Hs.102897 N91248 ESTs 25

133209 Hs.79265 AA114183 ESTs; Moderately similar to glutamate py 25

135154 Hs.267812 AA126433 sorting nexin 4 25

126838 Hs^79609 AA858097 pigment epithelium-derived factor 25

103803 Hs.106149 AA127698 ESTs 25

102139 Hs.2128 U15932 dual specificity phosphatase 5 25

128104 AA971000 op67g1 1 jbI Soares_NFL_T_GBC_S1 Homo sapi 25

127834 Hs.337631 AA761415 nz22d08.s1 NCI_CGAP_GCB1 Homo sapiens cD 2.8

133101 Hs.180952 AA488230 ESTs 2.8

127250 Hs.217916 AI023717 ESTs 2.8

135063 Hs.93883 D10537 myelin protein zero (Charcot-Marie-Tooth 2.8

126323 Hs.68644 N45014 yy80g06.r1 Soares_multiple_sclerosis_2Nb 2.8

121873 Hs.145696 AA426270 ESTs 2.8

122090 HsmS4 AA432141 ESTs 2.8

118728 HS522645 N73705 ESTs 2.8

135400 Hs.59915 M23263 androgen receptor (dihydrotestosterone r 2.8

125278 Hs.129998 W93523 ESTs 2.8

124387 Hs.109019 N27637 ESTs 2.8

124803 Hs.12186 R45480 cyclin K 2.8

H45968 Hs.32149 H45958 ESTs 2.8

104261 Hs.5409 AF008442 RNA polymerase I subunit 2.8

105366 Hs.282093 AA236356 ESTs 2.8

106070 Hs.5957 AA417781 Homo sapiens clone 24416 mRNA sequence 2.8

131356 Hs.25960 M13241 v-myc avian myelocytomatosis viral relat 2.8

112009 Hs.26255 R42714 EST 2.8

133199 Hs.250175 AA609773 Homo sapiens clone 23904 mRNA sequence 2.8

110379 Hs.53130 H44825 ESTs 2.8

103890 Hs.72085 AA236843 ESTs; Weakly similar to unknown [S.cerev 2.8

128152 R20353 yg20f10/1 Soares infant brain 1NIB Homo 2.8

107008 Hs.23740 AA598710 ESTs 2.8

135243 Hs.97101 AA215333 ESTs 2.8

103058 Hs.184510 X57348 stratifin 2.8

132020 Hs.293845 AA428990 ESTs 2.8

116354 Hs.292566 AA504262 ESTs 2.8

125887 Hs.12372 H98141 ESTs 2.8

120603 Hs.98541 AA282787 ESTs; Highly similar to (defline not ava 2.8

115119 Hs.46847 AA256524 Human DNA sequence from clone 30M3 on ch 2.8

133865 Hs.170290 F09315 discs; large (Drosophila) homolog 5 2.8

109415 Hs.110826 AA227219 Homo sapiens CAGF9 mRNA; partial cds 2.8

128657 Hs.23767 Z38910 ESTs 2.8

109984 Hs.10299 H09594 ESTs; Moderately similar to !!!! ALU SUB 2.8

133179 Hs.56731 U81599 homeobox B13 2.8

115998 Hs.336629 AA448488 ESTs; Weakly similar to zinc finger prot 2.8

112180 Hs.25067 R49116 EST 2.8

120428 Hs.173694 AA236822 ESTs; Moderately similar to (defline not 2.8

106241 Hs.6019 AA430108 ESTs 2.8

131060 Hs.22564 AA160890 myosin VI 2.8

111383 Hs.40919 N94527 ESTs 2.8

102123 Hs.1594 U14518 centromere protein A (17kD) 2.8

102722 Hs.79981 U79242 Human clone 23560 mRNA sequence 2.8

129887 Hs.274324 W92041 PCAF associated factor 65 alpha 2.8

126663 Hs.181297 AA714635 ESTs 2.8

104367 Hs.134342 H17438 ESTs; Weakly similar to seven transmembra 2.8

107316 Hs.193700 T63174 ESTs; Moderately similar to Ml ALU SUB 2* 

128059 Hs. 145098 AA972446 ESTs 23 

124447 N48000 ESTs 23 

111398 Hs.125585 R00086 deafness; X-linked 1; progressive 2.8 

134085 Hs.79018 U20979 chromatin assembly factor I (150 kDa) 23 

124788 Hs.100912 R43543 ESTs 23 

112248 KS326416 R51361 ESTs 2JB 

121309 H&97312 AA4Q2482 ESTs 24 

103076 Hs.75319 X59618 ribonucleotide reductase M2 potypeptkia 23 

107071 HsJSSW AA609053 ESTs 23 

104425 H&35380 H6849S ESTs 23 

132991 Hs.62245 AA446906 solute carrier family 25 (mitochondrial 23 

104968 H&296G9 AA084602 ESTs 23 

121153 Ks.97694 AA399640 ESTs 23 

131216 Hs£43901 031058 ESTs 23 

109682 Hs^2869 F09289 ESTs 23 

131990 Hs.168818 H77734 ESTs; Moderately simflar to roundabout 1 23 

132027 Hs.181444 N78844 ESTs; Weakly similar to R12C12.6 [Ceteg 23 

127383 Hs.190478 AA447990 ESTs 23 

132598 H&530 M81379 collagen; type (V; alpha 3 (Gooopasture 23 

101121 Hs.1313 L09753 tumor necrosis factor (Bgand) superfami 23 

123000 Hs.105640 AA479347 ESTs 23 

121329 Hs.1755 AA404324 ESTs 23 

100481 Hs.121489 HG1098-HT1098 CystatinD 2.7 

113803 HS2B3683 W42789 ESTs 27 

110934 Hs.169001 N48708 ESTs; WeaWy similar to cytochrome P450 2.7 

432888 T86823 ESTs 2.7 

121802 Hs.188898 AA424328 ESTs 2.7 

130398 Hs.155313 AB002331 Human mRNA for KIAA0333 gene; partial cd 2.7 

121103 H&97897 AA398936 ESTs; Weakly sfmflar to (deffins not ava 2.7 

131129 Hs.23240 R27296 ESTs 2.1 

130943 H&272429 D50B55 calcium-sensing receptor (hypocalciurtc 2.7 

134676 Hs£7819 W28051 ESTs; Weakfy sfrrflar to keratin 9; cytos 2.7 

111900 H&25318 R39044 ESTs 2.7 

106025 Hs.173334 AA412063 ESTs 2.7 

126144 Hs.40639 N39696 yx92a07.rl Soares melanocyte 2NbHM Homo 2.7 

103248 Hs.75262 X77383 cathepsInO 2.7 

127230 H&274170 H30501 Homo sapiens Opa-interacting protein OIP 2.7 

101584 HSJ4072 M35252 transmembrane 4 superfamily member 3 2.7 

124131 Hs.167489 H19980 ESTs 21 

Ha.77873 AA130156 ESTs 2.7 

H&9973 W92797 ESTs 2.7 

Hs.132967 AA347717 ESTs 2.7 

N23222 ESTs; Moderately sWar to HI! ALU SUB 2.7 

AA424881 ESTs 2.7 

AA203649 ESTs; WeaWy sbriHar to HEM45 [Hsaplens 2.7 

U64675 Human sperm membrane protein BS-63 mRNA, 2.7 

AA463627 ESTs 2.7 

028235 prostaglandirvendoperoxkJe synthase 2 (p 2.7 

AA262790 ESTs 2.7 

X64330 ATP citrate lyase 2.7 

AA099391 ESTs * 2.7 

AA424199 w81b01j1 SoaresjDtaLfetusJ^HTOJw 2.7 

R938Q2 ESTs 2.7 

AA323591 EST2S392 Cerebellum (I Homo sapiens cONA 2.7 

105031 Hs.12321 AA127240 ESTs 2.7 

126021 Hs.187518 AA775894 ESTs 2.7 

102116 U 13706 Human ELAV-Eke neuronal protein 1 1soto 2.7 

133394 H&237225 R16759 ESTs;Weakrystrnilarto(deflinenotava 2.7 

104267 Hs278439 C00358 ESTs 2.7 

107614 Hs.40241 AA004878 ESTs; Highly similar to (defltne not ava 2.7 

129809 Hs.1259 X55283 aslalogtycoprotein receptor 2 23 

112109 H&283309 R45221 ESTs; WeaWy similar to fill ALU SUBFAMI 27 

128422 T85681 yd60c06 jI Soares fatal Over spleen 1NF 2.7 

H*U3899 AA233702 ESTs 2.7 

H&292284 N72086 Homo sapiens RNA polymerase III largest 2.7 

106053 H&36727 AA416963 ESTs; HrghJy stmHar to hlstone H2A [its 2.7 

104440 H&284380 L20492 gammarglutarrryttransferase 1 2.7 
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132892 
120827 
134579 
106149 
132037 
130542 
122851 



H&256301 
H&332541 
Hs.179825 

Hs.99598 

50 134983 Ha.195384 
120537 Hs.160422 
131038 Hs.174140 
133889 H&211582 
128847 Hs.106529 
55 112755 HS306Q44 
129426 Hs.111323 AA412087 EST; Highly similar to (defline not ava 2.7
123798 AA820411 small inducible cytokine A5 (RANTES) 2.7
106716 H<l238928 AA464S62 ESTs 2.7
103663 Z78291 Z78291 Homo sapiens brain fetus Homo sap 2.7
114162 Hs.22265 Z38909 ESTs 2.7
113063 Hs£Q27 T32438 ESTs 2.7
127897 AA773857 af80c09/1 Soares_NhHMPu_S1 Homo sapiens 2.7
130621 Hs.16803 AA621718 ESTs; Weakly similar to (defline not ava 2.7
116245 Hs.42798 AA479958 ESTs; Highly similar to (defline not ava 2.7
125499 R11878 yf49d11.r1 Soares infant brain 1NIB Homo 2.7
133960 Hs.77899 M19267 tropomyosin 1 (alpha) 2.7
104470 Hs.246358 N28843 ESTs; Weakly similar to Similar to coBa 2.7
134982 Hs.92308 N46086 ESTs 2.7
106803 Hs.284295 AA479114 ESTs 2.7
104899 Hs.285574 AA054726 ESTs 2.7
125401 Hs.337585 AI204637 ESTs; Moderately similar to KIAA0350 [H. 2.7
111253 Hs.15768 N70042 ESTs; Moderately similar to !!!! ALU SUB 2.7
118449 Hs.164478 N66413 ESTs; Weakly similar to (defline not ava 2.7
134507 Hs.84318 M63488 replication protein A1 (70kD) 2.7
121609 Hs.98165 AA416867 EST 2.7
113835 Hs.27475 W56590 ESTs 2.7
113962 Hs.285290 W86375 ESTs; Highly similar to (defline not ava 2.7
121913 Hs.98558 AA428062 ESTs 2.7
108194 Hs£16717 AA057250 ESTs 2.7
130799 Hs.12696 AA464273 ESTs 2.7
123184 Hs.18166 AA489072 Homo sapiens mRNA for KIAA0870 protein; 2.7
103420 Hs.173497 X97065 SEC23-like protein B 2.7
106186 Hs.6315 AA427398 acetylserotonin N-methyltransferase-like 2J
101349 L77559 Homo sapiens DGS-B partial mRNA 2.7
112954 Hs.6655 T16559 ESTs 2.7
133054 Hs£91079 R07876 ESTs; Weakly similar to unknown [S.cerev 2.7
128131 Hs.25640 AI283162 claudin 3 2.6
101864 Hs.75777 M95787 transgelin 2.6
111948 Hs.26303 R40752 ESTs 2.6
130145 Hs.151051 U07620 protein kinase mitogen-activated 10 (MAP 2.6
126507 Hs.23964 AI362218 ESTs 2.6
117903 Hs.47111 N50740 ESTs 2.6
116345 Hs.199067 AA496981 ESTs 2.6
132227 Hs.4248 AA412620 ESTs 2.6
125746 Hs.274256 K03574 yj42b06j1 Soares placenta Nb2HP Homo sa 2.6
105073 Hs.89463 AA137034 ESTs 2.6
102764 U82310 Homo sapiens unknown protein mRNA, parti 2.6
131367 Hs.173933 AA456687 ESTs 2.6
130782 Hs.19500 AA307896 nuclear localization signal deleted in v 2.6
107427 Hs.46738 W26975 ESTs 2.6
117477 Hs.44175 N3032S ESTs 2.6
106290 Hs.16364 AA435542 ESTs 2.6
126829 Hs.7910 R11547 ESTs 2.6
116836 Hs.173001 N79820 ESTs 2.6
100147 Hs.136348 D13666 osteoblast specific factor 2 (fasciclin 2.6
104278 Hs.109253 C02582 ESTs; Highly similar to (defline not ava 2.6
135051 Hs.83484 C15324 ESTs 2.6
126081 Hs.227835 AI346024 collagen; type I; alpha 1 2.6
123579 AA608983 ai5d4.s1 Soares_testis_NHT Homo sapiens 2.6
130115 Hs.149923 M31627 X-box binding protein 1 2.6
101434 Hs.1430 M20218 coagulation factor XI (plasma thrombopla 2.6
122962 Hs.104720 AA478429 ESTs; Moderately similar to !!!! ALU SUB 2.6
126151 Hs.40808 AA324743 ESTs 2.6
128925 Hs£1851 D61676 Homo sapiens mRNA; cDNA DKFZp586J2118 (f 2.6
128919 Hs.103391 L27559 insulin-like growth factor binding prote 2.6
130296 Hs.154103 R09288 LIM protein (similar to rat protein kina 2.8
128402 Hs.191637 AA457244 ESTs 23
129273 Hs.109968 W63783 ESTs 2.6
125483 Hs.7788 F07759 ESTs 2.6
132953 Hs.321264 AA029927 ESTs 2.6
130963 HSJ21639 U57099 nuclear protein; marker for differentiat ZJb
120614 Hs.194154 AA284281 ESTs; Weakly similar to !!!! ALU SUBFAMI 2.6
123251 Hs.103267 AA490858 ESTs; Moderately similar to Rabln3 (Rjw 2.6
121710 Hs.56744 AA419011 ESTs 2J
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125428 Hs.851 W74608 


ESTs; Highly similar to (dofBns not ava 


2.6 


115906 Hs.82302 AA436616 


ESTs 


2Jo 


108432 AA07662S 


Homo sanlfios dons 23651 mRNA seauonos 


2.6 


126191 Hs.191911 H9772B 

1 fcW 191 no. ID 191 1 1 1 W F f fa W 


ESTs 


2.6 


106164 Hs 281434 AA425773 


ESTs 


2.6 


111519 H&268615 R08165 


ESTs 


2.6 


134590 Hs.173840 W58612 

IwHwOw no. 1 / OQ*tw fVwwwIfc 


ESTs 


2.6 


102565 U59746 


Human dssart hedaahoo OiDHHI mRNA. Daiti 


2.6 


1PQA79 He 11109 AA194973 


ESTs 


2.6 


114264 Hs_33A809 740074 


ESTs 


2-6 


106236 He 91104 AA429951 


ESTs 


2,6 


135192 Hs 321709 AR30Q234 


□urinannc recGDtor P2X: fioand-oated k> 


2.6 


109833 Hs.29889 H 00580 


ESTs 


2.6 


105756 He 8535 AA303Q88 

Iwwrwv ri9AKM9 /Wwuoo 


ESTs: Weakfv similar to transforrnafiorw 


2.6 


191499 He Q70K7 AA406210 


ESTs 


2& 


130417 Hs 155485 II 58522 


Human huntinnfin interactina nroteln (H\ 


ZJo 


194319 He 109390 H94647 


ESTs 


IB 


108988 He 971 99 AA 156 058 


ESTs 


2.6 


127081 Hs_1 80591 R883S2 
lef wu 1 no. lousoi noiAjut 


ESTs; WeaWy similar to weak similarity 


2£ 


190574 He 11463 AA45flR03 
no. 1 l*H)0 n/vwoouo 


PSTe* Waaidv similar to fdeffina not ava 

Uw 1 Of wVDOAJJ 9HIUKU IW ^UOIiUw 1 lUi BIB 


2.6 


119410 He 2R9DA R61680 


ESTs 


2.6 


123929 Hs.1 12981 AA621364 


ESTs 


2.6 


looom; He ifuftw AAA70070 

IccoUS no. IwhOwO /vW/Uu/U 


co la 


2.6 


116300 He 110837 AA599729 

1 IwwSO no. 1 lUww/ fVW99i£9 


Homo saolens homeobox nroteln A10 fHOXAl 

i IWIIIW WQWWIIO 1 »Wi 1 RWl/wA piWIWIIf HIV ^1 IWIV1I 


2& 


100070 H«153Q3A AA494044 

IOU£/o n&lwwwv'r rWtfctVrr* 


mra-hlrtdinn factor runt domain* ateha 

vviwini luiiiy kimwi | luiu wwiiioiii| Gupim 


2.6 


1CYW91 He 1AA5 M944.70 

1 ouuc i no. i*wo iwi£«***/vi 


mmnnetnp mnnnnhnenKate mdnrlfteA 


2,6 


IfYiroc Uc1QQ1R0 Hfi9387-HT24fi3 
IwwwKV no.l99lww nw«0/TiU«n» 


Trithoray Homafao Hnr 


2^ 


ifwoec Ue 9/1177 AA084104 


ESTs 


2.6 


117711 Hs 46485 N45201 


EST 


2^ 


194709 He 40719 B44357 


ESTs 


2& 


111 9QO He ^ M7QflA0 

nityy ns./*k}io in/oouo 


Colo 


2£ 


IfWAIft He •49071 746Q73 
lUODlO 110.020/ 1 Z.W0/0 


puuopiiwiiiuoiLiuo w^rui taoo r uaoo w 


2.6 


133699 He 105614. D 13642 

1 wwPfcg no. lOWD 1** UlwwtC 


KIAA0017 nana oroduct 


2£ 


196464 He 169977 A! 086782 


ESTs 


2.6 


1 nn 856 H A42454fT451 5 

IUU0OO nUrffcHOTI IHw 19 


Forkhcad FamBv Afyrl 

rujivicau ran my rviAi 


2.6 


133547 He 301 097 Y02B83 


T-efill rfif»ntor ateha A/'DihCl 


2.6 


lORftflrt He1Q3R£5 Pn7fW7 
IcODOU nS. 1*3000 rVfWf 


COIB 


2.6 


195739 He 09137 AA428557 


v-mvc avian mvetaevtomatosls viral on con 

ViiljW arail iiijoiwwjiwiiiBUJOM luoi wiii^fjj 


2.6 


102276 Hs.1 0247 Lf 30999 


Human fmamcl mRNA. 3UTR 

■ iuii toil ^iiiui tiwy mi u w«| wwiii 


2.6 


incccft He1Q1R36 AA979137 

luoooo no. 19 1000 rwioiof 


CO 1 0 


2.6 


103078 Hs 34136 AA307443 


ESTs 


2.6 


IfiSKn rloxOwwUI iOVAJ££ 


FSXs* WmIoV similar to fdaflina not ava 


2.6 


11 4919 He 91901 739338 


ESTe* Hinhh/ simitar to fdafiine not ava 

tuis, ruyiuy ouiuku lu ^uoiuiio iiwiava 


2.6 


11SQC0 He 40099 H7Q310 


EST 
cot 


2.6 


100990 He 306005 AA1Q3366 


PRTe 
CO io 


2.6 


144000 Hc7A909 ! I9Q17H 
100909 tlS.faC.fJC U£9 1 f O 


RWl/^IF rfiteted* matrix assorted* adi 


2,6 


100640 Hs_182183 HG2743-HT2B45 
iUUD*tv no. lot low notf'fwiiifiw'w 


CaMasmon 1 Alt SoDce 3. Nan-Muscle 

VB MWI 1 IWI 1 • | nib W^UWU Wj I *W 1 IflWMH 


2.6 


103009 He 965006 AA5Q8749 
IOOUm n5X(w9gO nnOOO/*ro 


CO 1 a 


2& 


114306 He 6540 740861 


ESTs 


2Jo 


incncn Ho1719Q1 AA4179A7 
lUOUOw no. 1 r 1 wo 1 /vWI/cO/ 


PJormlnnl hlnrlkin nmtnin 9 


ZS 


107748 Hs.60772 AAG17258 

i ui (hq no. WW! » C WW 1 « WW 


EST 


2£ 


100134 H e 49 n 13264 


maarnnhana sffivanoar racsotor 1 
iimuupitttya owavoiiyoi iwvopiwi i 


2J5 


1O00CQ He 7ft 1113044 
lwwouo rl5./0 U10UW 


ftA-hlnrflnn nmtaln tranerrintifffl factor 


2S 


iOAOQO Ue 74016 AA4W001 
1wv9o£ no./ *W 10 /VVtOOwvl 


ESTs 


2JS 


127493 H&291701 AA808081 


oc39a08.s1 NCLCGAPJ5CB1 Homo sapiens cO 


25 


132869 H&203961 N26855 


ESTs 


2JS 


117570 Hs.44583 N34415 


EST 


2.5 


124644 Hs.109654 N91279 


ESTs 


2J5 


103558 Hs£785 Z19574 


keratin 17 


25 


132883 H&5897 AA047151 


ESTs 


2JS 


1Q2009 H&82643 U02680 


protein tyrosine kinase 9 


25 


116058 H&20159 AA454156 


ESTs 


2* 


121989 Hs.193784 AA430044 


ESTs 


25 


131257 H&2490B AA256042 


ESTs 


25 


100320 Hs.75275 D50916 


homotog of yeast (S. cerevisiae) ufd2 


25 


102959 Hs.121524 X15722 


glutaGnione reductase 


25 


132969 Hs.6166 AA047616 


ESTs 


25 


130869 Hs2057 AA128100 


uridine monophosphate synthetase (orotat 


25 


129645 Hs.1 18131 L38928 


5;1C^menyltetrahydrofolate synthetase 


25 
126399 Hs.83883 AA128075 zl16d08.r1 Soares_pregnant_uterus_NbHPU 25
134069 Hs.78935 U29607 Homo sapiens eIF-2-associated p67 homolo 25
109816 Hs.61980 F11013 ESTs; Weakly similar to KIAA0176 [H.sapi 25
134801 Hs.89595 X02160 insulin receptor 25
104232 Hs.10587 AB002351 Human mRNA for KIAA0353 gene; partial cd 25
107361 Hs.159486 U72513 Human RPL13-2 pseudogene mRNA; complete 25
106057 HSJ289074 AA417067 ESTs 25
134252 Hs.80720 AA031782 Homo sapiens mRNA; cDNA DKFZp586B1722 (f 25
128062 Hs.105547 AA379500 ESTs 25
110009 Hs.6614 H10933 ESTs 25
111375 Hs.20432 N93696 ESTs 25
122642 Hs.99361 AA454166 ESTs 25
127999 Hs.69851 AA837495 ESTs; Weakly similar to Wiskott-Aldrich 25
105029 Hs.13268 AA126855 ESTs 25
105082 HJL26765 AA143763 ESTs; Weakly similar to Similarity to S. 25
TABLE 1A shows the accession numbers for those primekeys lacking UnigeneID's for Table
1. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.
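
The relationship described above (one probeset key, one gene cluster number, and the Genbank accessions that make up the cluster) can be pictured as a simple lookup. The following is only a minimal sketch, assuming each row has been restored to a single whitespace-separated line of the form `Pkey CAT_number accession...`; the function names `parse_cluster_rows` and `pkeys_by_accession` are hypothetical helpers introduced for illustration and are not part of the described method.

```python
from collections import defaultdict


def parse_cluster_rows(lines):
    """Map each Pkey to its CAT number and list of Genbank accessions."""
    clusters = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank or incomplete row
        pkey, cat_number, *accessions = fields
        clusters[pkey] = {"cat": cat_number, "accessions": accessions}
    return clusters


def pkeys_by_accession(clusters):
    """Invert the mapping: accession -> list of Pkeys whose cluster contains it."""
    index = defaultdict(list)
    for pkey, entry in clusters.items():
        for accession in entry["accessions"]:
            index[accession].append(pkey)
    return dict(index)


# Two rows as they read in this table (assumed layout: one row per line).
rows = [
    "127248 227560.1 AA364195 AA325029 AW962050",
    "102398 entrez_U42359 U42359",
]
clusters = parse_cluster_rows(rows)
index = pkeys_by_accession(clusters)
```

With such an index, an accession found in another table can be traced back to the probeset whose cluster contains it.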



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

118417



111555J 

1596090J 

1606216J 

32479J 

48158.-7 

1562851J 

1708455J 

37186 : 1 



126023 
126086 
102565 
101964 
125499 



125661 327827.1 
125957 1583542J 
125982 1766315.1 
127248 227560.1 
103731 112052.1 
1Z7261 231687.1 
127265 232391.1 
128659 1541209.1 
40 127315 37938.1 
103806 112618.1 
128104 502608J 



45 



128152 
128422 
127897 
106566 



297868J 
1811283.1 
446527.1 
120358J 



50 129735 44573.2 



55 



60 123147 
130529 
123579 
109175 
100789 

65 100858 



219802.4 
158447.1 

genbank_M6Q8983 
genbanlUW 80496 
figr_HT4163 
$Sr_HT4515 



AA071210 AA069899 AA071438 AA084912 AA084803 AA079371 AA079370 

H57661 H58881 

H756S1H7Q975 

AB010994U59748AA064660 

S81578 

H10543R11878 
R25698 R56582 R56018 

AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 A1636743 AW614951 BE467547 AI680833 
AI633816 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AIB18326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA2071 55 AA206262 
AA204833 AW003247 AW496808 AI080480 AI531703 AI551023 AI867418 AW818140 AA5Q2500 A1206199 AI671282 
AI352545 BE501030 A1S52535 BE465762 AA206331 AW451B66 AA471088 AA206342 AA204834 AA206100 AW021661 
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 
BE46G61 1 A1206344 AA574397 AA348354 Af493192 

AA491830 R50173 R55192 R50320 AI732306 AI7323Q5 A1820727 AI820728 R55191 R50319R50227 

K41694 H45213 

R98091 W92898 

AA364195 AA325029 AW962050 

AA070545 AA131490 AA131 373 

AA330501 AA661557 

AA331503 AA33Z751 AW962542 

T16245 R19694 F13545 H1Q299 T66048 T65279 H18006 

AF1 16622 AM 14507 AA640834 AA377999 

AA130614AA071410 

AA906093AA971000 

K47610R86920 

F07973 R20353AA442660 

T77794T85681 

AA773681AA773857 

BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 
AI369742 AI039658 AI885095 AI476470 AI287650 A1885299 AI985381 AW592624 AW340136 AI266556 AA456390 
AI310815AA484951 

AI950087 N70208 R97040 N36809 AI3081 19 AW967677 N35320 A1251473 H59397 AW971573 R97278 W01059 
AW967671 AA908598 AA251875 A1820501 A1820532 WB7891 TB5904 U71456 T82391 BE328571 T75102 R34725 
AA884922 BE328517 AI219788 AA884444 N92578 F13493 AA927794 AI560251 AW874068 AL134043 AW235363 
AA663345 AW008282 AA488954 AA283144 A1890387 A1850344 A1741346 A1689062 AA282915 AW102898 A1872193 
AI763273 AW173586 AW150329 A1653832 AT762688 AA988777 AA488892 AI356394 AW1 03813 AI539842 AA642789 
AA856975 AW505512 AI961530 AWS29970 BE612861 AW276997 AW513601 AW512843 AA0442O9 AW856538 
AA180009 AA337499 AW961101 AA251669 AA251874 AI819225 AW205862 AI683338 A1858509 AWZ76905 AI633006 
AAS72584AA908741 AW072629 AW513996 AA293273 AA969759 N75628 N22388 H84729 H60052 T92487 AJ022058 
AA780419 AA551005 W80701 AW613456 AJ373032 AI564269 F00531 H83488 W37181 W78802 R66056 AKXJ2839 
R67840 AA30Q207 AW959581 T63226 FO4Q05 
AA487961 

AA1 78953 AA192740 

AA608983 

AA180498 



U10072 
123788 579959.1 AA620411 AA287491
102116 entrez_U13706 U13706
102398 entrez_U42359 U42359
102764 entrez_U82310 U82310
118475 genbank_N66845 N66845
104776 genbank_AA026349 AA026349
104787 genbank_AA027317 AA027317
113702 genbank_T97307 T97307
113938 genbank_W81598 W81598
122635 genbank_AA454085 AA454085
108407 genbank_AA075519 AA075519
108432 genbank_AA076626 AA076626
108555 genbank_AA084963 AA084963
101349 entrez_L77559 L77559
124447 genbank_N48000 N48000
119071 genbank_R31180 R31180
103520 entrez_Y10511 Y10511
103863 genbank_Z78291 Z78291
123465 genbank_AA599033 AA599033
128046 877605.1 AA873285 AI025762
126959 546044.1 AA199853 AA206355
TABLE 2: shows a preferred subset of the Accession numbers for genes found in Table 1
which are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue (Relaxed ratio (87/70))
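
As a reading aid for the column legend above, the sketch below shows one way a single restored row could be split into its five fields. It is an illustration only, under the assumption that a row sits on one line, that the Pkey is a six-digit number, that a UnigeneID (when present) has the form `Hs.<digits>`, and that R1 is the last number on the line; `Row`, `ROW_RE` and `parse_row` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    pkey: str
    exaccn: str
    unigene_id: Optional[str]  # None when the probeset lacks a UnigeneID
    title: str
    r1: float


# Assumed row layout: Pkey ExAccn [UnigeneID] Unigene Title R1
ROW_RE = re.compile(
    r"^(?P<pkey>\d{6})\s+(?P<exaccn>\S+)\s+"
    r"(?:(?P<unigene>Hs\.\d+)\s+)?"
    r"(?P<title>.+?)\s+(?P<r1>\d+(?:\.\d+)?)$"
)


def parse_row(line: str) -> Optional[Row]:
    """Return the parsed row, or None if the line does not fit the layout."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    return Row(m["pkey"], m["exaccn"], m["unigene"], m["title"], float(m["r1"]))


with_id = parse_row("102913 X07696 Hs.80342 keratin 15 5")
without_id = parse_row("113938 W81598 ESTs 5.4")
```

The optional UnigeneID group is what lets rows for probesets without a Unigene number, which are resolved through Table 2A instead, parse with the same pattern.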
Pkey ExAccn UnigeneID Unigene Title R1

131919 AA121266 Hs.272458 ESTs 372
120328 AA196979 Hs.290905 ESTs; Weakly similar to (defline not ava 323
101486 M24902 Hs.1852 acid phosphatase; prostate 252
119073 R32894 Hs.279477 ESTs 243
133428 M34376 Hs.183752 microseminoprotein; beta- 233
128180 AA595348 Hs.171995 kallikrein 3; (prostate specific antigen 214
104080 AA402971 Hs.57771 Homo sapiens mRNA for serine protease (T
127537 AA569531 Hs.162859 ESTs 183
131665 R22139 Hs.30343 ESTs 174
101050 K01911 Hs.1832 neuropeptide Y 173
130771 N48056 Hs.1915 folate hydrolase (prostate-specific memb 17
107485 W63793 Hs.262476 S-adenosylmethionine decarboxylase 1 16.7
106155 AA425309 Hs.33297 ESTs 163
129534 R73640 Hs.11260 ESTs 164
100569 HG2261-HT2351 Antigen, Prostate Specific, Alt. Splice 16
101889 S39329 Hs.181350 kallikrein 2; prostate 154
135389 U05237 Hs.99872 fetal Alzheimer antigen 15
133944 AA045870 Hs.7780 ESTs 123
130974 X57985 Hs.2178 H2B histone family; member Q 113
114768 AA149007 Hs.182339 ESTs 113
104660 AA007160 Hs.14846 ESTs 114
131061 N64328 Hs.268744 ESTs; Moderately similar to KIAA0273 [H. 103
128645 AJ167042 Hs.61635 Homo sapiens BAC clone RG041D11 from 7q2 10.7
135153 N40141 Hs.95420 Homo sapiens mRNA for JM27 protein; comp 103
107033 AA599629 Hs.113314 ESTs 103
118417 N66048 ESTs; Weakly similar to polymerase pisa 103
126758 W37145 Hs.283960 ESTs 102
107102 AA609723 Hs.30652 ESTs 10.1
116787 H28581 Hs.15641 ESTs 10.1
115719 AA416997 Hs.59622 ESTs 10
123209 AA489711 Hs.203270 ESTs 93
101664 M60752 Hs.121017 H2A histone family; member A 93
112971 T17185 Hs.33883 ESTs 9J
117984 N51919 Hs.106778 ESTs 9.7
129523 M30894 Hs.274509 T-cell receptor; gamma cluster 94
132984 AA031360 Hs.167133 ESTs 92
121853 AA425887 Hs.98502 ESTs 9
119617 W47380 Hs.55999 ESTs 83
105627 AA281245 Hs.23317 ESTs 83
101461 M22430 Hs.76422 phospholipase A2; group IIA (platelets; 8.7
124526 N62096 Hs.293185 yz61c5.s1 Soares_multiple_sclerosis_2NbH 83
133845 T68510 Hs.76704 ESTs 82
133354 AA055552 Hs.334762 ESTs; Weakly similar to KIAA0319 [H.sapi 8.1
119018 N95796 Hs.278695 ESTs 8
100394 D84276 Hs.36Q52 CD38 antigen (p45) 8
106579 AM56135 Hs.23023 ESTs 73
114965 AA250737 Hs.72472 ESTs 74
112033 R43162 Hs.22627 ESTs 7.1
102398 U42359 Human N33 protein form 1 (N33) gene, exo 7
101201 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin; 63
101803 M86546 Hs.155691 pre-B-cell leukemia transcription factor 63
120562 AA280036 Hs.502267 ESTs; Weakly similar to W01A6.c [C.elega 63



WO 02/30268 



PCT/US01/32045 



109112 AA169379 Hs.257924 ESTs 63
109795 F10707 Hs.326416 ESTs 6.7
130336 X07730 Hs.171995 kallikrein 3; (prostate specific antigen 63
131425 AA219134 Hs.26691 ESTs 6.6
132602 AA490969 Hs.39838 ESTs 63
133724 U07919 Hs.75746 aldehyde dehydrogenase 6 63
120215 Z41050 Hs.108787 Homo sapiens Mcd4p homolog mRNA; complet 63
131681 AA010163 Hs.3383 upstream regulatory element binding prot 63
100727 X07290 Hs.334786 Human HF.12 gene mRNA 63
121770 AA421714 Hs.278428 Homo sapiens mRNA for KIAA0896 protein; 63
123475 AA599267 Hs.250526 ESTs; Weakly similar to ANKYRIN; BRAIN V 63
133061 AB000584 Hs.296638 prostate differentiation factor 63
116429 AA609710 Hs.279923 ESTs; Weakly similar to similar bQTP-b 62
101233 L29008 HsJTB sorbitol dehydrogenase 6.2
104691 AA011176 Hs.37744 ESTs 62
127248 AA325029 EST27853 Cerebellum (1 Homo sapiens cDNA 62
105500 AA256485 Hs.222399 ESTs 6.1
130828 AA053400 Hs.203213 ESTs 53
115357 AA281793 Hs.72988 ESTs 53
116334 AA491457 Hs.48948 ESTs 57
120132 Z38839 Hs.125019 ESTs; Weakly similar to !!!! ALU SUBFAMI 53
106375 AA443993 Hs.269072 ESTs 53
124777 R41933 Hs.140237 ESTs; Weakly similar to neuronal thread 53
101791 M83822 Hs.62354 Human beige-like protein (BGL) mRNA; par 53
117698 N41002 Hs.45107 ESTs 53
122041 AA431407 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 53
133723 AA088851 Hs.262476 S-adenosylmethionine decarboxylase 1 53
113938 W81598 ESTs 5.4
133015 AA047036 Hs.246315 ESTs 54
108186 AA056482 Hs.7780 ESTs 53
104468 N25110 Hs.326392 Human guanine nucleotide exchange factor 53
104033 AA365031 Hs.98944 ESTs 53
110844 N31952 Hs.167531 ESTs; Weakly similar to (defline not ava 53
129056 H70627 Hs.108336 ESTs; Weakly similar to !!!! ALU SUBFAMI 53
133493 AA284143 Hs.194369 Homo sapiens chromosome 1 atrophin-1 rel 53
129184 W28769 Hs.109201 ESTs; Highly similar to (defline not ava 62
101448 M21389 Hs.195850 keratin 5 (epidermolysis bullosa simplex 5.1
116188 AA484728 Hs.184598 ESTs; Weakly similar to !!!! ALU SUBFAMI 5.1
105921 AA402613 Hs.169119 ESTs 5.1
103375 X91868 Hs.34416 sine oculis homeobox (Drosophila) homolo 5.1
128871 AA400271 Hs.106778 ESTs; Highly similar to (defline not ava 5.1
116238 AA479362 Hs.47144 ESTs 5
102913 X07696 Hs.80342 keratin 15 5
103011 X52541 Hs.326035 early growth response 1 5
118981 N93839 Hs.39288 ESTs; Weakly similar to !!!! ALU SUBFAMI 5
TABLE 2A shows the accession numbers for those primekeys lacking UnigeneID's for Table
2. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.




Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number Accession

118417 37186.1 AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833
AK633S18 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262
AA204833 AW003247 AW496808 AI080480 AI531703 AI651Q23 AI887418 AW818140 AA502500 AI206199 AI571282
AI352545 BE501030 AI652535 BE465762 AA206331 AW451886 AA471088 AA206342 AA204834 AA206100 AW021661
AA332922 N66048 AA703396 H92278 AW139734 H92683 U87589 U87595 K89001 U87594 BE465420 AI624817
BE466611 AI206344 AA574397 AA348354 AI493192
127248 227560.1 AA364195 AA325029 AW962050
107033 235652.1 AI141999 AA730176 R44544 R41778 AW300793 AW966157 AA918501 AA599629 AJ082195 AI198537 AW006520
AW236663 AW151420 AI826987 AI810832 AI669102 AI201981 N27331 AA335566 T84622 BE085347 BE085269
102398 entrez_U42359 U42359
113938 genbank_W81598 W81598
TABLE 3: shows genes, including expressed sequence tags, differentially expressed in
prostate tumor tissue compared to normal tissue as analyzed using the Affymetrix/Eos Hu02
GeneChip array. Shown are the relative amounts of each gene expressed in prostate tumor
samples and various normal tissue samples showing the highest expression of the gene.
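
The text does not spell out how the tumor-to-normal ratio is computed from the array data. Purely as an illustration of the idea of comparing tumor expression with the normal tissue showing the highest expression, the sketch below assumes that the tumor level is the mean over tumor samples, that the reference is the highest mean among the normal tissues, and that a small floor guards against division by near-zero signal. All three choices, the function name `tumor_to_normal_ratio`, and the sample values are assumptions and are not taken from the source.

```python
from statistics import mean


def tumor_to_normal_ratio(tumor_values, normal_by_tissue, floor=1.0):
    """Ratio of mean tumor expression to the highest mean normal-tissue expression.

    The averaging, the choice of the highest normal tissue as reference, and
    the floor value are illustrative assumptions, not the source's procedure.
    """
    tumor_level = mean(tumor_values)
    highest_normal = max(mean(values) for values in normal_by_tissue.values())
    return tumor_level / max(highest_normal, floor)


# Hypothetical expression values in arbitrary intensity units.
ratio = tumor_to_normal_ratio(
    tumor_values=[100.0, 140.0],
    normal_by_tissue={"liver": [10.0, 30.0], "kidney": [30.0, 50.0]},
)
```

Under these assumptions, a gene expressed at 120 units on average in tumor samples and at most 40 units in any normal tissue would receive a ratio of 3.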
Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue



Pkey ExAccn UnigeneID Unigene Title



100131
100235
100570
100819
101063
101247
101416
101447
101485
101514
101626
101663
101758
101768
101817
101888
102031
102052
102221



D12485 Hs.11951
D28954 Hs.13421
HG2261-HT2352
HG4020-HT4290
L00354 Hs.80247



102302
102348
102457
102473



102751



103043
103093
103378
103401
103613
103677
103962
104084
104257
104301
104769
104851
104896
104956
104957
104967
105099



L33801 

M17254 

M21305 

M24736 

M28214 

M57399 

M60750 

M77838 

M81118 

M88163 

M99701 

U04898 

U07559 

U24578 

U26173 

U33052 

U37519 

U48807 

U49957 

U71207 

U75272 

U60034 

U90914 

XQ2544 

X54667 

X55733 

X60708 

X92098 

X95240 

ZA662B 

Z83806 

AA298180 

AA410529 

AP006265 

D45332 

AA025887 

AA040882 

AA054228 

AA074880 

AA074919 

AA084506 

AA150776 



Hs.78802 
Hs.279477 

Hs.89548 

Hs.123072 

Hs.44 

Hs.2178 

Hs.79217 

Hs.78989 

HS.1S22B2 



H&2156 

HsJ05 

Hs5844 

Hs.78334 

HS59171 

Hs57539 

H&2359 

Hs.180398 

HSJ29279 

Hs.1867 

Hs.68583 

Hs5Q57 

H&572 

Hs.123114 

Hs.93379 

Hs.44926. 

Hs.323378 

H5L54431 

H&2316 

Hs53243 

Hs.30732 

Hs5222 

Hs.6783 

Hs£93943 

Hs.10290 

Hs£3165 

H&20509 

Hs.10026 

H&291000 

Hs23729 

H&28389 



phosphodiesterase I/nucleotide pyrophosp
KIAA0056 protein
Hs.171995
Hs.2387
cholecystokinin
glycogen synthase kinase 3 beta
v-ets avian erythroblastosis virus E26 o
Human alpha satellite and satellite 3 ju
selectin E (endothelial adhesion molecul
RAB3B; member RAS oncogene family
pleiotrophin (heparin binding growth fac
H2B histone family; member A
pyrroline-5-carboxylate reductase 1
SWI/SNF related; matrix associated; acti
transcription elongation factor A (SII)-
RAR-related orphan receptor A
ISL1 transcription factor; LIM/homeodoma
LIM domain only 4
nuclear factor; interleukin 3 regulated
protein kinase C-like 2
aldehyde dehydrogenase 8
LIM domain-containing preferred transloc
eyes absent (Drosophila) homolog 2
progastricsin (pepsinogen C)
mitochondrial intermediate peptidase
carboxypeptidase D
orosomucoid 1
cystatin S
eukaryotic translation initiation factor
dipeptidylpeptidase IV (CD26; adenosine
coated vesicle membrane protein
specific granule protein (28 kDa); cyste
SRY (sex-determining region Y)-box 9 (ca
H.sapiens mRNA for axonemal dynein heavy
ESTs
ESTs
estrogen receptor-binding fragment-assoc
ESTs
ESTs; Weakly similar to !!!! ALU SUBFAMI
U5 snRNP-specific 40 kDa protein (hPrp8-
ESTs
ESTs; Weakly similar to hypothetical pro
ESTs; Weakly similar to ORF YJL063C [S.c
ESTs
Homo sapiens clone 24405 mRNA sequence
ESTs




R1 

63 
5.1 

Antigen, Prostate Specific, Alt. Splice

Transglutaminase 105

85 

4.7 

4.7 

11 

93 

62 

8.4 

45 

54 

75 

55 

5.7 

132 

85 

5J8 

74 

B2 

55 

5.1 

5.7 

9 

105 

155 

45 

225 

4.7 

45 

55 * 

S2 

74 

5.2 

45 

6 

64 

65 

105 

65 

45 

55 

64 

45 

65 

7 

5.1 
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105304 AA233553 Hs, 190325 ESTs 4.7 

105370 AA236476 H&2Z791 ESTs; Wealdy similar to transmembrane pr 103 

105427 AA251330 H&28248 ESTs 5 

105542 AA261858 Hs366957 ESTs; Weakry similar to heat shock prote 85 

5 105628 AA281251 Hs.79828 ESTs; Weakly similar to putative zfnc fl 55 

105640 AA281623 Hs.6685 ESTs; Weakly similar to KIAA0742 protein 8 

105845 AA282138 Hs.11325 ESTs 14 

105691 AA287097 Hs389D68 transcription factor 4 63 

105730 AA292701 Ha5364 DKFZP564I052 protein 45 

10 105608 AA393803 H&286131 K1AAD438 gene product 7 

105826 AA398243 Hs. 194477 ESTs;MooaratBlysfrn3artosfrn2artoN 5 

105903 AA401433 Hs200016 ESTs; Weakly similar to (fiphosphotnosito 9.9 

105906 AA401633 Ha22380 ESTs 115 

106065 AM17558 H&25206 ESTs 5.1 

IS 106094 AM19461 H&23317 ESTs 105 

106157 AA425367 HS54892 ESTs 6.6 

106184 AA426643 Hs.10762 ESTs 85 

106211 AA426240 Hs. 126083 ESTs 8.4 

* 106213 AA428258 Hs5769 Homo sapiens mRNA; cDNA DKFZp564E153 (fr 5.7 

20 106272 AA432074 Hs523099 ESTs 53 

106369 AA443828 Hs388856 ESTs 63 

106400 AA447621 Hs54109 ESTs 6.4 

106474 AA450212 H&42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 92 

106507 AA452584 Hs367819 .protein phosphatase 1; regulatory (Wiib 5.6 

25 106523 AA453441 Hs31511 ESTs 4.7 

106532 AA453628 Hs37443 ESTs 4.7 

106557 AA455087 H&22247 ESTs 5.7 

106575 AA456039 Hs. 105421 ESTs 72 

106618 AA459249 Hs5715 ESTs; Weakly similar to Similarity with 55 

30 106820 AA481Q37 Hs. 12592 ESTs SA 

106846 AA485223 Hs.34892 ESTs 53 

106973 AA505141 Hs.11923 Human DNA sequence from done 1 67A19 on 75 

107110 AA809952 Hs.12784 KIAAQ293 protein 6.1 

107127 AA620504 Hs.179898 ESTs 7.1 

35 107159 AAS21340 Hs.10600 ESTs; Weakly slmBar to ORF YKR081C [S.c 52 

107217 D51095 Hs35861 DKFZP586E1 621 protein 15.1 

107365 1178294 Hs.111256 arachidonate 15-Bpoxygenase; second typ 4.7 

107630 AA007218 Hs50178 ESTs 53 

107734 AA016225 Hs.7517 ESTs 43 

40 107760 AA018042 H&252085 EST 75 

107997 AA037388 Hs52223 Human DNA sequence from clone 1 41 HS one 105 

108012 AA039616 Hs.173334 ESTs 65 

108520 AA084138 Hs.46786 ESTs 75 

108583 AA088276 Hs58826 ESTs 55 

45 108613 AA100967 Hs59165 ESTs 6 

108664 AA113349 Hs59588 EST 63 

108677 AA115629 Hs.118531 ESTs 53

108807 AA1299B8 H&49376 ESTs; Weakly similar to PROTEIN PHOSPHAT 53 

108910 AA136590 ESTs 5 

50 108933 AA147224 Hs337232 ESTs 12.7 

108948 AA149579 Hs.118258 ESTs 63 

109014 AA156780 Hs362038 ESTs 153 

109124 AA171529 Hs.183887 ESTs 6.1

109142 AA176438 H&41295 ESTs 5.1 

55 109277 AA196332 Hs36043 ESTs 55 
109342 AA213620 Homo sapiens mRNA; cDNA DKFZp586M1418 if 6 

109562 F01811 Hs. 187931 ESTs; Moderately similar to voltage-gate 105 

109565 F01930 H&23848 ESTs 7 

109648 F046CO Hs.7154 ESTs 95 

109799 F10770 Hs.180378 Homo sapiens clone 669 unknown mRNA; com 64

109859 H02308 H&20792 ESTs 53 

110181 H20276 H&31742 ESTs 165 

110854 N32919 H&27931 ESTs 10 

110924 N47938 Hs.12940 yy84aQ9.s1 Soares^niuttipIe_sclerosfe^Nb 5.6 

65 111048 N55514 Hs318584 ESTs 63 

111091 N59858 Hs33032 Homo sapiens mRNA; cDNA DKFZp434N185 (fr 53 

111157 N66613 Hs59364 ESTs 5 

111164 N66857 Hs.122489 ESTs; Weakly similar to !!!! ALU CLASS C 55

111221 N68869 Hs.15119 ESTs 63 
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111348 N90041 Hs3585 ESTs 54 

111353 N90430 Hs.6618 ESTs 53 

111495 R07210 H&9683 ESTs 53 

111540 R08850 Hs3786 ESTs 6 

111579 R10657 Hs.167115 KIAA0830 protein 123

111581 R10684 H&5794 ESTs 7.1 

111734 R25375 Hs.128749 ESTs 62 

111861 R37460 Hs25231 ESTs 94 

111870 R37778 Hs.18685 ESTs; Weakly similar to hypothetical pro 63 

111937 R40431 Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (fr 4.8

111987 R42036 Hs.6763 KIAA0942 protein 6.4

112184 R49173 Hs330242 ESTs 5.6 

112286 R53765 Hs.158135 KIAA0981 protein 93 

112380 R59740 Hs3740 ESTs 4.7 

15 112452 R63841 Hs.157461 ESTs 6 

112601 R79111 Hs.78225 annexin A1 54

112753 R93696 Hs.169882 ESTs 5.8 

112902 T09262 Hs.129190 ESTs 5.1 

112984 T23457 Hs289014 ESTs 4.9 

113021 T23855 Hs.129836 KIAA1028 protein 103

113083 T40530 Hs268957 ESTs; Weakly similar to heat shock prote 5.7

113200 T57773 Hs.10263 ESTs 73 

113494 T88878 Hs36538 ESTs 8.7 

113849 W60439 Hs3858 ESTs; Moderately similar to cbp146 [M.mu 43

25 113883 W72382 Hs.11958 oxidative 3 alpha hydroxysteroid dehydro 4.7 

113950 W65765 Hs30504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 6.7

113988 W87462 Hs21894 ESTs 53 

113989 W87544 H&268828 ESTs 4.7 
114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein 213

30 114340 Z41395 Hs.143611 ESTs 93 

114346 Z41450 Hs.130489 ESTs 52 

114435 AA018216 Hs.164975 Bicaudal D (Drosophila) homolog 1 7.4

114463 AA025370 Hs.40109 KIAA0872 protein 82

114652 AA101416 Hs.107149 ESTs; Weakly similar to PTB-ASSOCIATED S 5.4

35 114721 AA131450 Hs.103822 ESTs 43 

114730 AA133527 Hs331328 ESTs; Weakly similar to The KIAA0138 gen 5.1

114833 AA234362 H&37159 ESTs; Moderately similar to CGI-66 prote 53

114860 AA235112 Hs.42179 ESTs; Moderately similar to similar to m 63 

114884 AA235811 Hs283672 ESTs 52 

40 114895 AA236177 Hs.76591 KIAA0887 protein 4.7 

114908 AA236545 HS34973 ESTs 52 

114932 AA242751 Hs.16218 KIAA0903 protein 5.7

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (fr 52 

115140 AA258030 H&279938 ESTs; Weakly similar to supported by GEN 53 

45 115468 AA287061 Hs48499 ESTs; Highly similar to Bdeight protein 4.7 

115583 AA398913 Hs45231 LDOC1 protein 73 

115709 AA412519 Hs38279 ESTs 43 

115772 AA423972 Hs.131740 ESTs 5 

115774 AA424Q29 Hs288390 ESTs; Moderately similar to dynamin; int 54 

50 115776 AA424038 Hs.81897 ESTs 5 

115821 AA427528 Hs.130965 ESTs; Weakly similar to ZINC FINGER PROT 137 

115955 AA446121 Hs.44198 Homo sapiens BAC clone RG054D04 from 7q3 103

116024 AA451748 Hs.83883 Human DNA sequence from clone 718J7 on c 63

116108 AA457568 Hs28777 ESTs 6 

116117 AA459117 Hs31575 SEC63; endoplasmic reticulum translocon 73

116146 AA460701 K&15423 ESTs 53 

116298 AA489033 Ks32601 Homo sapiens mRNA; cDNA DKFZp586K1318 (f 5.7

116379 AA521472 Hs.71252 ESTs 53 

116393 AA599463 Hs.306051 protein phosphatase 2 (formerly 2A); reg 53 

60 116401 AA599963 HS39698 ESTs 73 

116416 AA609219 Hs39982 ESTs 92 

116587 059325 Hs.121429 ESTs 52 

116601 D80055 Hs.45140 ESTs 43 

116684 F09156 Hs36095 ESTs 72 

65 116722 F13654 HSFIH32 Stratagene cat#937212 (1992) Horn 53 

116766 H13260 HS35097 ESTs 52 

117453 N29568 Hs.108319 thyroid hormone receptor-associated prot 63

117557 N33920 Hs.44532 diubiquitin 43

117708 N45114 Hs.126280 ESTs 63 
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118001 N52151 Hs.47447 ESTs 114 

116229 N62339 Hs.166254 heat shock 90kD protein 1; alpha 62

118599 N69207 Hs203697 ESTs 5.8 

118545 N7Q358 Hs.125180 growth hormone receptor 7.1 

118873 N89881 Hs44577 ESTs 6 

118985 N943Q3 H&55Q28 ESTs 9.3 

119107 R42424 Hs33841 ESTs 6 

119128 R45175 Hs.117183 ESTs 17J9 

119271 T16387 Hs35328 ESTs 6 

119367 T78324 Hs250895 ESTs 5 

119721 W69440 Hs.48376 ESTs 154 

119741 W7Q205 H&43670 kinesin family member 3A 10.1

119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical pro 53

120217 Z41078 H&66035 ESTs 4.8 

120266 AA173939 HS2Q5442 ESTs; Weakly similar to inner centromere 83

120294 AA190888 Hs.153881 ESTs; Highly similar to NY-REN-62 antige 4.9

120418 AA236010 HS26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (f 4.7

120486 AA253400 Hs.137569 tumor protein 63 kDa with strong homolog 5.6 

120524 AA261852 Hs.1 92905 ESTs 4.9 

120571 AA280738 H&34892 ESTs 83 

120596 AA282074 Hs237323 ESTs 62 

120713 AA292655 Hs36557 ESTs 92 

120992 AA398248 HS37S94 ESTs 164 

121429 AM06293 Hs41167 ESTs 63 

121503 AA412049 Hs290347 ESTs 7.6 

121512 M412105 Hs.193738 ESTs 53 

121616 M424814 HSA8B27 ESTs 43 

122027 AA431302 Hs38721 EST; Weakly similar to N-coplne [H.sapie 53 

122294 AA437311 Hs38927 ESTs 5.7 

122411 AM46859 Hs39083 ESTs 63 

122791 AA460158 Hs.129836 KIAA1028 protein 124

122792 AA460225 Hs39519 ESTs 5.1 
122859 AA478539 Ha. 104336 ESTs 43 
123095 AA485724 HS27413 ESTs 54 
123100 AA4S5957 Hs306219 Homo sapiens clone 25032 mRNA sequence 5 
123295 AA495981 Ks250830 ESTs 4.7 
123311 AA496252 Hs.105069 ESTs 74 
123583 AA509006 Hs.111240 ESTs 9.1 
123619 AA609200 ESTs 4J 
123645 AA609310 Hs.188691 ESTs 43 
123709 AA609651 Hs.112742 ESTs 7 
123968 C14333 Hs.108327 damage-specific DNA binding protein 1 (1 5 
124178 H45996 Hs37101 putative G protein-coupled receptor 63
124352 N21626 Hs.102406 ESTs 102 
124357 N22401 yw37g07.s1 Morton Fetal Cochlea Homo sap 103 
124515 N58172 Hs.109370 ESTs 142 
124911 R88992 Hs.174185 ESTs 42 
125154 W38419 ESTs 4.7 
125992 W01626 za36e07x1 Soares fetal liver spleen 1NF 5.1
126602 AA947601 Hs37056 ESTs 6.1
126812 236290 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1 43
127080 AA662913 Hs.190173 ESTs 5
127308 AA507628 Hs334390 ESTs 43
127370 AKJ24352 Hs.70337 immunoglobulin superfamily; member 4 4.7
127386 AI457411 Hs.106728 ESTs 43 
127965 AA828760 H$292059 ESTs 43 
128172 AI400862 Hs265130 ESTs 5 
128305 A1Q39722 Hs279009 ESTs 53 
128420 A1088155 H&41296 ESTs; Weakly similar to unknown [H.sapie 17
128467 AA176446 Hs.180428 ESTs; Weakly similar to hypothetical 43. 43
128610 L38608 Hs.10247 activated leucocyte cell adhesion molecu 73
128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapi 8.1
128651 AA446990 Hs.103135 ESTs 63
129088 AA215971 Hs.194431 KIAA0992 protein 52
129138 N26391 H&250723 ESTs 6.1
129171 AA234048 Hs.7753 calumenin 52
129229 AA211941 Hs.109643 polyadenylate binding protein-interactin 63
129386 N27524 Hs260024 Cdc42 effector protein 3 52 
129467 AA410311 Hs.44208 ESTs 5.1 
129564 H22138 Hs.75295 guanylate cyclase 1; soluble; alpha 3 165

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiqu 92

129821 F11019 Hs.12696 cortactin SH3 domain-binding protein 85

129823 X00948 Hs.105314 relaxin 2 (H2) 9.1

129847 W46767 H&296178 ESTs; Weakly similar to RNA POLYMERASE I 5.4

129912 AA047344 Hs.107213 ESTs; Highly similar to NY-REN-6 antigen 65 

129958 L20591 Hs.1378 annexin A3 5.1

129977 JO4078 Hs.1395 early growth response 2 (Krox-20 (Drosop 8.6

130061 U82256 Hs.172851 arginase; type II 74

10 130241 U78313 H&153203 MyoD family inhibitor 45 

130466 N21679 Hs. 180059 ESTs 55 

130541 X05608 H&211584 neurofilament; light polypeptide (68kD) 6.7

130619 AA477739 Hs. 12532 ESTs 64 

130925 N71935 Hs.169378 multiple PDZ domain protein 75

130938 AA013250 H&21398 ESTs; Moderately similar to PUTATIVE GLU 62

130971 H20332 Hs501444 signal sequence receptor; gamma (translo 6.4

131066 F09006 H&22588 ESTs 5 

131126 F09012 Hs.181326 myotubularin related protein 2 6.4

131310 JQ2960 Hs5551 adrenergic; beta-2-; receptor; surface 75

20 131487 AA253220 H&Z7373 Homo sapiens mRNA; cDNA DKFZp56401763 (f 55 

131561 X59841 Hs594101 pre-B-cell leukemia transcription factor 7.6

131562 U90551 H&28777 H2A histone family; member L 5.1 
131579 N62922 Hs59088 ESTs 11 
131629 AA442119 H&238809 ESTs 45 

25 131682 AA428368 HS50654 ESTs 45 

131699 R68657 Hs50421 ESTs; Moderately similar to !!!! ALU SUB 65

131795 N32724 Hs52317 Sox-like transcriptional factor 5.6 

132053 H93381 H&38085 ESTs; Weakly similar to putative glycine 72

132122 U65092 Hs.40403 Cbp/p300-interacting transactivator; wit 55

30 132191 AA449431 H&288361 KIAA0741 gene product 8 

132256 AA608856 Hs.431 murine leukemia viral (bmM) oncogene h 55 

132482 AA429478 H&238126 ESTs; Highly similar to CGI-49 protein [ 65 

132533 AA021608 Hs.172510 ESTs 55 

132572 AA448297 H&237B25 signal recognition particle 72kD 65

35 132581 R42266 Hs52256 ESTs; Weakly similar to beta-TrCP protei 16 

132700 N47109 Hs5521 ESTs 65 

132701 AA279359 H&55220 BCL2-associated athanogene 2 55
132725 L41887 Hs.184167 splicing factor; arginine/serine-rich 7 75
132783 N74897 H&278894 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 55

132790 X75535 Hs.168670 peroxisomal farnesylated protein 8

132839 U76189 Hs51152 exostoses (multiple)-like 2 55

133142 F03321 Hs.65874 ESTs 52 

133342 U295B9 Hs.7138 cholinergic receptor; muscarinic 3 105 

133434 AA278852 Hs5Q212 ESTs 55 

133453 M68941 Ks.73826 protein tyrosine phosphatase; non-recept 45

133520 X74331 Hs.74519 primase; polypeptide 2A (58kD) 13.1

133544 T33873 Hs.74624 protein tyrosine phosphatase; receptor t 45 

133608 013315 Hs.75207 glyoxalase I 45

133626 H75939 Hs.75277 Homo sapiens mRNA; cDNA DKFZp586M141 (fr 5 

50 133633 D21262 Hs.75337 nucleolar phosphoprotein p130 65 

133797 S66431 Hs.76272 retinoblastoma-binding protein 2 6

133928 N34096 Hs.7766 ubiquitin-conjugating enzyme E2E 1 (homo 5.4

134095 U47414 Hs.76069 cyclin G2 52

134249 N89827 Ks50667 RALBP1 associated Eps domain containing 65 

55 134321 AA416230 Hs5172 ESTs 7 

134453 X70683 Hs53484 SRY (sex determining region Y)-box 4 4.7 

134542 X57025 Hs55112 insulin-like growth factor 1 (somatomedi 77

134570 U66615 Hs.172280 SWI/SNF related; matrix associated; acti 6.4

134592 U82613 Hs289104 Alu-binding protein with zinc finger dom 54

60 134654 W23625 Hs5739 ESTs; Weakly similar to ORF YGR200c [§£ 5 

134666 AA482319 Hs.8752 putative type II membrane protein 64

134806 Z49099 Hs59718 spermine synthase 6.7 

134951 AA431480 Hs.169358 ESTs 95 

135066 X04602 Hs53913 Interleukin 6 (interferon; beta 2) 5.7

65 135155 AA358268 Hs.166556 ESTs; Moderately similar to transcripto 45 

135411 L10333 Hs.99947 reticulon 1 55

300023 M10098 AFFX control 18S ribosomal RNA 4.6

300254 AW079607 Hs55610 ESTs; Weakly similar to ZnT-3 [H.sapiens 75

300273 AW013907 Hs.167531 ESTs; Moderately similar to predicted us 115 
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300566 
300578 
300671 
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300810 
300813 
3X823 
300834 
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301042 
301242 
301254 
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301563 



10 



15 



20 



25 



30 



35 



40 



45 303125 



301783 
301805 
301646 
301891 
302005 
302056 
302067 
302099 
302147 
302214 
302236 
302358 
302410 
302486 
302582 
302785 
302782 
302881 



302970 
302977 



303309 
303344 
50 303380 
303401 



55 



303540 
303572 



303702 
303718 
303722 
303732 
303735 
303752 
303753 
303813 



304218 



AW157648 Hs.153506 ESTs; Weakly similar to microtubule-acti 8.5

H86709 H&326392 son of sevenless (Drosophila) homolog 1 5.8

AI989417 Hs.1 34289 ESTs 44 

AI239706 Hs.93810 ESTs 7.9 

AA039352 Hs.125034 ESTs; Weakly similar to ORF YDL040c [S.c 4.5

AW468066 H&24817 ESTs; Weakly similar to KIAA0986 protein 52 

AM97778 H&2Q509 ESTs 6.4 

AW76890 Hs.146847 ESTs 5.8 

AA406411 H&208341 ESTs; Weakly similar to K1AA0989 protein 10j6 

AI863068 Hs.106823 ESTs; Weakly similar to putative zinc fi 5.6

AF109300 Hs.147924 ESTs 6.7 

AW136372 Hs.1852 ESTs 7.6 

AA593373 H&293744 ESTs 53 

AA947682 H&20252 ESTs; Weakly similar to Chain A; Cdc42hs 7 

AI659131 Hs.1 97733 ESTs 245 

AW161535 Hs.23782 ESTs 1UB 

AKW9624 H&283390 EST cluster (not in UniGene) with exon h 4.3

H29500 Hs.7130 ESTs; Moderately similar to N-copine [H. 43 

AA156879 H&262G38 ESTs; Weakly similar to ZINC FINGER PROT 6.6

AI8Q2946 Hs.44208 ESTs; Weakty similar to matah to ESTs AA 5.7 

AW008475 Hs.151258 EST cluster (not in UniGene) with exon h 6.8

Z44810 Hs30t789 ESTs; Weakly similar to similar to Cele 63 

AL046347 HSJB3937 Homo sapiens PAC clone DJ1 159004 from 7p 62 

A1800004 Hs.142846 ESTs; Weakly similar to MesP1 [M.musculu 8.5

R20002 Hs.6823 ESTs; Weakly similar to Intrinsic factor 4.6 

AF131855 H&27B591 Homo sapiens clone 25056 mRNA sequence 6.3

A1869666 Hs.123119 ESTs 363 

A1457532 Hs30488 ESTs; Moderately similar to ROSA26AS [M. 9.5 

K05698 Hs222399 ESTs; Weakly similar to protein-tyrosine 53

AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 8.8

AB022660 Hs.151717 KIAA0437 protein 5.9 

AJ001454 Hs.159425 Homo sapiens mRNA for testam-3 43 

A1128606 Hs.6557 zinc finger protein 161 43 

D81150 H&322848 EST cluster (not in UniGene) with exon h 5.5

NMJJ04917 H&218366 EST cluster (not in UniGene) with exon h 263

AC003682 Hs.183512 multiple UniGene matches 8.2

NMJ000522 H&249195 EST cluster (not in UniGene) with exon h 6.4

AA425562 Hs.11065 EST cluster (not in UniGene) with exon h 5

AA343696 Hs.46821 ESTs; Weakly similar to putative [H.sapi 4.8

AA508353 Hs.105314 relaxin 1 (H1) 783

N58545 H&42346 histone deacetylase 3 83 

AW118352 HS312679 EST cluster (not in UniGene) with exon h 74

AW263124 HS315111 EST cluster (not in UniGene) with exon h 53

AF199613 EST cluster (not in UniGene) with exon h 4.6

AF161352 Hs.111782 EST cluster (not in UniGene) with exon h 53

AI571580 Hs.170307 ESTs 43 

AA215297 Hs31441 EST cluster (not in UniGene) with exon h 64

AL134164 Hs.145416 ESTs 63 

AA255977 H&250646 ESTs; Highly similar to ubiquitin-conjug 193

AA2S8471 HS326567 EST cluster (not in UniGene) with exon h 63

AA758552 HS309497 ESTs 63 

AW516519 H&273294 ESTs 43 

AA348111 HS36900 ESTs 12.1

AA355607 Hs309490 ESTs; Weakly similar to MMSET type I [H. 8.2

AW338520 Hs.242540 ESTs 8.4 

AW500106 Hs*3643 EST cluster (not in UniGene) with exon h 4.9

D30891 Hs.19525 EST cluster (not in UniGene) with exon h 15.7

AW500748 Hs.224951 ESTs; Weakly similar to 73 kDA subunit o 63

AI741397 Hs.114658 ESTs 43 

AA521510 Hs.145010 ESTs 123 

AW5Q2405 Hs.125759 ESTs; Weakly similar to tumor suppressor 43

AA707750 Hs.169055 ESTs; Weakly similar to cis-Golgi matrix 5.4

AI017286 HS3957 EST cluster (not in UniGene) with exon h 53

AW503733 Hs.9414 ESTs 13

AI275850 Hs.114658 EST cluster (not in UniGene) with exon h 73

R00493 Hs.125565 translocase of inner mitochondrial membr 43

N66373 Hs27973 ESTs; Weakly similar to ZK354.7 [C.elega 6

AA668128 H&45207 EST singleton (not in UniGene) with exon 5.7 

AI024916 H&251354 ESTs 5.7 
307848 At364186 EST singleton (not in UniGene) with exon 75


307871 A1368865 Ks.31476 EST singleton (not in UniGene) with exon 54
308050 A1460004 H&31608 EST singleton (not in UniGene) with exon 8.1
308362 AI613519 Hs.105749 EST singleton (not in UniGene) with exon 5.5
308923 AI863051 Hs27B815 ESTs 4.4
309116 A1927149 H&29797 ribosomal protein L10 A3
309375 AW075342 Hs5271 EST singleton (not in UniGene) with exon 7.4
309574 AW205604 H&266009 ESTs; Weakly similar to !!!! ALU SUBFAMI 5
310055 AJ921750 Hs.144871 ESTs 5
310098 AI585841 Hs.161354 ESTs 11J6
310250 AM78629 Hs.158465 ESTs 55
310365 AI262148 Hs.145569 ESTs 9.7
310382 AI734009 Hs.127699 EST cluster (not in UniGene) 104
310409 AI612775 Hs.145710 ESTs 4.6
310431 AI420227 Hs.149358 ESTs 72J9
310573 AW292180 Hs.156142 ESTs 75
310598 A1338013 Hs.140546 ESTs 92
310639 AW269082 Hs.175162 ESTs 4.5
310787 AW262580 Hs.147674 ESTs 4.9
310816 AI973Q51 Hs.224965 ESTs 75
311251 A1655662 Hs.197698 ESTs 415
311280 AI767957 Hs.198248 ESTs; Weakly similar to Y38A8.1 gene pro 4.5
311330 AI678524 H&201629 ESTs; Moderately similar to !!!! ALU SUB 4.6
311515 AW136713 H&23862 ESTs 5.9
311574 AI824863 Hs.211420 ESTs 4.8
311587 A18282S4 H&271018 ESTs 55
311596 AI682Q88 Ho 70375 ESTs 264
311631 AI809519 H&27133 ESTs 6.4
311688 AWD25661 H&240090 ESTs 7.4
311783 AIS82478 Hs.13528 EST 4.6
311828 AA76S470 Hs.85092 ESTs 6.7
311853 AWD14013 Hs 107056 ESTs 5.3
311601 R16890 Hs 137135 ESTs 5.6
311632 A WAS 1 RCA H&257482 ESTs 45
312153 AA759250 Hs.118625 cytochrome b-561 11
312182 AAS34800 Hs526263 EST cluster (not in UniGene) 165
312242 AI38Q207 Hs.125276 ESTs 4.7
312296 C013S7 Hs 127128 ESTs 55
312407 R4618Q Hs.153485 ESTs 62
312424 AA847398 Hs.291997 ESTs 45
312425 R49353 Hs.293892 ESTs 55
312480 R68651 Hs.144997 ESTs 95
312518 CI 7785 Hs.182738 ESTs 65
312521 AAD336Q9 Hs 939884 ESTs 115
312527 AI695522 Hs.191271 ESTs 4.7
312539 AI004377 H&200360 ESTs 7
312546 A1623511 Hs.118567 ESTs 5.1
312563 AA976064 Hs.180842 ESTs 65
312623 AA694607 Hs.176956 EST cluster (not in UniGene) 105
312857 AA772279 Hs.126914 ESTs 5
312890 AI813654 Hs-5657 ESTs 55
312903 AA939266 H&278626 ESTs 7.7
312905 H92571 Hs_234478 ESTs 65
312976 AA836271 Hs.125830 ESTs 45
312983 A1079278 H&269899 ESTs 5.1
312998 AA249018 Hs.154331 EST cluster (not in UniGene) 7
313035 N36417 H&144928 ESTs 65
313166 AJ801098 Hs.151500 ESTs 45
313188 AJ039702 Hs.179573 collagen; type I; alpha 2 45
313218 AA827805 Hs.124298 ESTs 5
313226 AI200281 Hs.123910 ESTs 55
313325 A1420611 Hs.127832 ESTs 45
313328 AI08812O Hs.122329 ESTs 74
313425 AA745689 Hs.186638 ESTs; Weakly similar to similar to zinc 65
313499 AI261390 Hs.146035 ESTs 55
313540 A1797301 H&5740 ESTs 55
313568 AW467376 Hs.129640 ESTs 45
313569 AJ273419 Hs.135146 ESTs; Weakly similar to ZK1058.5 [C.eleg 45
313603 AW468119 H&287631 EST cluster (not in UniGene) 65
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313615 AW285194 Hs301997 OKFZP434N126 protein 52 

S1S62S AW466402 Hs354G20 ESTs 73 

313634 AA688292 Hs337786 ESTs 44 

313635 AA507227 H&3390 ESTs 8.1 
313638 AI753075 Hs.104627 ESTs 67

313670 C16690 Hs.23767 EST cluster (not in UniGene) 44

313671 W49823 Hs.104613 ESTs AA 
313676 AA861697 Hs.120591 EST cluster (not in UniGene) 134 
313703 AI161293 H-L280380 ESTs; Weakly similar to KIAA0525 protein 10

313712 AA768S53 Hs.74170 ESTs 53

313800 AW296132 Hs35Q98 ESTs 5.4 

313979 AJ535895 HS221024 ESTs 43 

314121 AI732100 Hs.187619 ESTs 133 

314123 AW245993 Hs223394 ESTs 6.4 

314171 AK21895 Hs.193481 ESTs 294

314188 AL138431 Hs.164243 ESTs 43

314219 AL036001 Hs48376 ESTs 5.7 

314236 AA743396 Hs. 189023 ESTs 43 

314237 AA732359 Hs36264 ESTs AA 
314284 AA731431 H&293484 EST cluster (not in UniGene) 64

314305 A1280112 Hs.125232 ESTs 53 

314343 AI754701 H&32B476 ESTs; Weakly similar to alternatively sp 62

314530 AI052358 Hs.193726 ESTs 43 

314691 AW207206 Hs.136319 ESTs 17 

314695 AW5Q2698 Hs.118152 ESTs 83

314785 A1538226 Hs32976 ESTs 94 

314601 AA4810Z7 Hs.109045 ESTs; Weakly similar to ORF YGR245c [S.c 8

314664 AA493811 Hs294068 ESTs 6 

314907 AI672225 Hs222888 ESTs 193 

314916 AA548806 Hs.122244 ESTs 43

314954 AAS21381 Hs.187726 ESTs 53 

3t4981 AAS24953 Hs393334 ESTs 43 

315021 AA533447 Hs312989 EST cluster (not in UniGene) 5.1 

315051 AW292425 Hs.163484 EST 153 

315052 AA878910 Hs.134427 ESTs 20

315073 AW452948 H&257631 ESTs 53 

315084 AI821D85 ESTs 83 

315214 AI915927 Hs34771 ESTs 54 

315220 AI420753 Hs.66731 ESTs &1 

315278 AI885544 Hs.12450 ESTs 53

315282 A1222165 Hs.144923 ESTs 43 

315368 AW291663 Hs.104696 ESTs 6 

315369 AA764918 HS356531 ESTs 43 
315378 AI263393 Hs.145008 ESTs 63 

315379 AI378329 Hs.126629 ESTs 54

315402 AVI/293424 Hs,75354 ESTs 5.1 

315442 AA977935 Hs.127274 ESTs 63 

315443 AW003416 Hs.160604 ESTs 53 
315528 R37257 Hs.184780 ESTs 8.1 

315593 AW198103 Hs.158154 ESTs 93

315634 AA837085 H&220585 ESTs 73 

315705 AW449285 Ks313636 ESTs 83 

315707 AI418055 Hs.181160 ESTs 5.1

315714 AA744015 Hs398138 EST cluster (not in UniGene) 6.1

315740 T05558 Hs.156880 EST cluster (not in UniGene) 63

315762 AI391470 Hs.158618 ESTs 53 

315769 AA744875 Hs.189413 ESTs 5 

315843 AA679430 Ks.191897 ESTs 5.7 

315990 AI800041 Ks. 190555 ESTs 93 

316012 AA764950 Hs.119898 ESTs 43

316036 AA708016 Hs.180389 ESTs 53 

316055 AA693860 Hs.6947 EST cluster (not in UniGene) 6.7 

316074 AW517542 Hs383273 ESTs 53 

316100 AW203986 Hs313003 ESTs 5,1 

316169 AI127483 Hs.120451 ESTs 83

316442 AA760894 Hs.153023 ESTs 17.1 

316491 AA766025 Hs.186854 EST 43 

316504 AW135854 Hs.132458 ESTs 43

316667 AW015940 H&232234 ESTs 73 
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316854 AA831215 Hs.159066 ESTs; WeaJdy sfmllar to predicted using 5.1 

316905 AW138241 H&210846 ESTs 64 

317008 AW051597 Hs.143707 ESTs 44 

317019 AA864968 Hs.127699 ESTs 11 

317194 AW445167 Hs.126036 ESTs 135 

317224 D56760 Hs.93029 ESTs 8.7 

317404 AI806B67 Hs.1 26594 ESTs 8.7 

317501 AA931245 Hs.137097 ESTs 11.1 

317548 AI554187 Ks.195704 ESTs 145 

317651 AW282779 Hs.169799 ESTs 55 

317758 A/733277 Hs.128321 ESTs 54 

317850 N29874 Hs.152982 EST cluster (not in UniGene) 114

317869 AW295164 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 135

317902 AI828602 H&211265 ESTs 55 

317916 AI565071 Hs.159983 ESTs 7.7 

318239 AI085198 Hs.164228 ESTs 13.1 

318268 AI817736 Hs.182490 ESTs 65

318327 AW294013 Hs.200942 ESTs 45 

318363 R45530 Hs.1440 gamma-aminobutyric acid (GABA) A recepto 6

318428 A1949409 H&194591 ESTs 125 

318464 AI151010 Hs.157774 ESTs 45 

318524 AW291511 Hs.1 59066 ESTs 255 

318540 T30280 H&274803 EST duster (not in UniGene) 7 

318591 AW206806 Hs.1 15325 ESTs 45 

318615 AI133817 Hs.10177 ESTs 55 

318646 AW175665 H&Z78695 ESTs 57 

318667 AI493742 Hs.165210 ESTs 11 

318668 W26276 H&138075 ESTs 55 
318753 AA576265 Hs.7130 copine IV 55
319080 Z45131 Hs53023 ESTs 165
319181 F06504 H&27384 EST cluster (not in UniGene) 45
319191 AF071538 Hs79414 prostate epithelium-specific Ets transcr 65
319233 R21054 Hs.180532 ESTs 45
319586 D78808 Hs583683 ESTs 85
319750 AA621606 Hs.117956 ESTs 95
319763 AA460775 Hs.6295 ESTs 145
319824 AA424266 Hs.123642 EST cluster (not in UniGene) 125
319838 AA337642 Hs55262 nuclear factor related to kappa B bindin 5.1
319913 AA179304 Hs571588 ESTs; Moderately similar to !!!! ALU SUB 45
319964 TB0579 H&29Q27D ESTs 55 
320076 AI653733 H&271593 ESTs 85 

320102 AW29B219 Hs.115325 RAB7; member RAS oncogene family-like 1 95

320187 T99949 H&503428 EST cluster (not in UniGene) 95

320211 AL039402 Hs.125783 DEME-6 protein 75 

320324 AF071202 Hs.139336 ATP-binding cassette; sub-family C (CFTR 565

320455 R49889 H&24144 EST cluster (not in UniGene) 85 

320464 AI089817 H&237146 ESTs 54 

320561 NMJX36953 Hs.159330 EST cluster (not in UniGene) 7 

320574 AL049443 Hs.161283 Homo sapiens mRNA; cDNA DKFZp586N2G20 (f 44 

320576 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (fr 6.7 

320654 AW263086 Hs.118112 ESTs 6 

320796 AF038966 Hs51218 secretory carrier membrane protein 1 135 

320800 AI661006 Hs.71721 ESTs 65

320813 AW360847 Hs.16578 ESTs 95

320853 A1473798 Hs.135904 ESTs 8.1 

320356 059945 Hs55368 EST cluster (not in UniGene) 6

320899 AA633772 Ks.116796 ESTs 95 

320918 AW195012 H&293970 ESTs 5 

320973 H19732 Hs547917 ESTs 55 

321099 AA018386 H&54341 ESTs 45 

321190 H52462 Hs.163872 EST cluster (not in UniGene) 55

321318 AB033041 Hs.137607 EST cluster (not in UniGene) 84

321382 AW372449 Ks.175982 EST cluster (not in UniGene) 75

321441 AW297633 Hs.118498 ESTs 147

321538 H80483 Hs.46903 EST cluster (not in UniGene) 95

321609 H86021 Hs.182538 ESTs; Weakly similar to hMmTRAlb [H.sapi 45

321636 AT791838 Hs.193465 ESTs 55 

321638 AI358352 Ha.108932 ESTs 45 

321644 AI204177 H&537396 ESTs 6.6 
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65 



321681 AA233821 Hs.190173 EST cluster (not in UniGene) 4.6
321726 X91221 Hs.144465 EST cluster (not in UniGene) 5
321758 U29112 H&196151 EST cluster (not in UniGene) 62
321877 AL109784 Hs.189222 EST cluster (not in UniGene) 4.6
321899 N55158 H&29468 ESTs 4j6
321902 AA746374 K&145010 ESTs B2
322007 AW410546 Hs.164649 ESTs 5.1
322055 AL137646 Hs.146001 EST cluster (not in UniGene) 45
322092 AF085833 H&135624 EST cluster (not in UniGene) 45
322221 AI890619 Hs.179662 nucleosome assembly protein 1-like 1 AA
322278 AF086283 EST cluster (not in UniGene) 5.8
322303 W07459 Hs.157601 EST cluster (not in UniGene) 22
322437 AW393804 Hs.170253 ESTs; Weakly similar to rabaptM [Usa 44
322493 AF143235 H&279819 EST cluster (not in UniGene) 72
322782 AA056060 H&2Q2577 EST cluster (not in UniGene) 184
322811 AA782292 Hs.105872 ESTs 6,9
322818 AW043782 H&293616 ESTs ia7
322826 A1807B83 Hs.180059 ESTs 5
322887 AI986306 Hs56149 ESTs; Weakly similar to KIAA0969 protein 115
322889 AA081924 Hs.124918 ESTs 7.1
322924 AA669253 Hs.136075 ESTs 45
322982 AI351191 Hs.128430 ESTs 65
322994 AA422116 Hs.191461 ESTs 4.7
323040 AA336609 Hs.10862 ESTs 6.9
323041 AL118747 H&26691 EST cluster (not in UniGene) 85
323045 AA148950 Hs.188836 ESTs 4.6
323048 AL11B923 Hs.175110 EST cluster (not in UniGene) 75
323070 AA157726 Hs264330 ESTs 75
323071 AA157867 Hs5722 ESTs 4.7
323097 Z44354 H&296261 guanine nucleotide binding protein (G pr 45
323131 AA176982 K&270124 EST cluster (not in UniGene) 6.1
323136 AL120351 K&30177 EST cluster (not in UniGene) 4.3
323175 AI827137 HSL336454 ESTs 62
323216 AF131846 Hs.13396 Homo sapiens clone 25028 mRNA sequence 65
323226 AF055019 H&21906 Homo sapiens clone 24670 mRNA sequence 12.6
323236 AA363148 H&293960 ESTs 105
323262 AI829770 Hs.190642 ESTs 75
323276 AA836452 Hs523822 ESTs 75
323287 AA639902 H&104215 ESTs 24.7
323335 AI655499 Hs.161712 ESTs 14.1
323341 AL134875 Hs.108646 ESTs 55
32ff362 AL135067 Hs.117182 ESTs 6.1
323486 C05278 H&299221 ESTs; Moderately similar to [PYRUVATE DE 85
323496 AI826801 H&300700 ESTs 45
323507 H71721 Hs.128387 ESTs 4A
323545 AI814405 Hs524569 ESTs 55
323623 AA314280 Hs.146589 EST cluster (not in UniGene) 5
323683 AW263526 H&243023 ESTs 7.7
323691 AA317561 Hs.145599 EST cluster (not in UniGene) 55
323810 AA740405 Hs.108806 ESTs 6.2
323846 AA337621 Hs.137635 ESTs 6
323929 AA354940 Hs.145958 ESTs 10.7
323959 AI636775 Hs.6831 ESTs SA
323996 AA367032 H&217882 ESTs 55
323997 AA844907 H&274454 EST cluster (not in UniGene) AA
324019 AW177009 EST cluster (not in UniGene) 45
324130 AL046575 Hs.130188 ESTs 11
324295 AI146686 Hs.143591 ESTs 13.7
324296 A1524039 Hs.192524 ESTs 65
324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 45
324330 AA884766 EST cluster (not in UniGene) 45
324385 F28212 H&284247 EST cluster (not in UniGene) 4.7
324430 AA464018 Hs.184598 EST cluster (not in UniGene) 135
324452 AW014022 Hs.170953 ESTs 75
324547 AW5O1074 Hs.74170 ESTs 55
324603 AWD16378 H&292934 ESTs ZA2
324617 AA508552 Hs.195839 ESTs 54
324618 AI346282 H&87159 ESTs 45
324620 AA448021 HsJ4109 EST cluster (not in UniGene) 5.7
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324676 
324691 
324696 
324713 
324715 
324718 
324720 
324752 
324753 
324780 
324801 
324804 
324845 



324961 
325108 
20 326816 



25 



AA361016 
M564134 
A1741633 
AA613792 
AA401863 



H&22380 



327098 



330020 
330211 



330430 
330546 
330551 



330700 
330704 
330705 
330706 
330712 
330725 
330732 
330752 
330763 
330772 



330949 
330977 
331017 
331099 
331128 
331151 
331195 
331320 
331321 
331337 
331348 



331422 
331442 
331466 
331479 
331490 
331493 
331561 
331615 



331811 



AI685464 ESTs 
AI694767 Hs.129179 ESTs 
AW503943 Hs.112451 ESTs

AJ217963 Hs293341 ESTs; Weakly similar to Pro-a2(XI) [H.sa

AA641092 Hs257339 ESTs 

AW340249 Hs.163440 ESTs

AI739168 Hs.131798 EST cluster (not in UniGene)

AI557019 Hs.116467 ESTs

AA576904 H&292437 ESTs 

A1279919 Hs^72072 ESTs; Moderately similar to !!!! ALU SUB
AA612626 Hs.144871 EST cluster (not in UniGene)
AI334367 Hs.159337 ESTs 
AI81S924 Hs.14553 ESTs 
ESTs 

Hs.337533 ESTs 
Hs.136102 KIAA0853 protein
Hs.125350 ESTs

EST cluster (not in UniGene) 
ESTs 

CH20Jisgl 6552458 
CH21_hsgI|5867660 
CH21_hsgi 6682516 
CH.07_hsgi 5868455 
CHX.hsgl|5868837 
CH.16jj2gi|6165201 
CH.16_p2gi|5091594 
CH.16j)2gi|6671887 
CH.05_p2gi|6013592 
M23263 androgen receptor (dihydrotestosterone r

HG2261-HT2352 H&321110 
U31382 H&299867 guanine nucleotide binding protein 4
U39840 hepatocyte nuclear factor 3; alpha 

AA319514 H&30732 ESTs 
AA037415 H&209S9 ESTs 
AA056557 Hs.6759 ESTs 
AA1Q2571 Hs.157078 ESTs 

AA121140 Hs.177576 ESTs; Moderately similar to kynurenine a 
AA167269 H&52620 ESTs 

AA252033 H&24052 ESTs; Weakly similar to !!!! ALU SUBFAMI
AA261092 Hs35254 ESTs 

AA449677 Hs.15251 Human DNA sequence from clone 437M21 on

AA450200 Hs.143187 FK506-binding protein 3 (25kD)

AA479114 Hs.11356 ESTs 

D60374 EST 

AA149579 Hs.91202 ESTs 

H01458 Hs.142896 ESTs 

H20826 Hs.315181 ESTs 

N24619 Hs.108920 ESTs 

R36671 Hs.14846 ESTs 

R51361 Hs268714 ESTs 

R82331 Hs268838 ESTs 

T64447 Hs.168439 ESTs 

AA262999 H&300141 ESTs 

AA2783S5 Hs.87929 ESTs 

AA287662 Hs.118630 ESTs

AA400596 Hs38143 ESTs 

AA416979 Hs31897 ESTs 

AA454543 Hs.43543 ESTs 

F10802 Hs237339 ESTs; Moderately similar to !!!! ALU SUB
H77381 Hs.41223 ESTs 
N21680 Ks.43455 ESTs 
N27154 Hs.44076 ESTs 

N32912 H&291039 ESTs; Weakly similar to hypothetical 43.

N34357 H&93817 ESTs 

N62780 Hs.48703 ESTs 

N92352 Hs5472 ESTs 

W48868 Hs.334305 ESTs 

238907 Hs.65949 KIAA0888 protein 

AA404500 Hs.187958 ESTs 
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43 

10.6 

102 

65 

72 

34.4 

4.8 

7.9 

52 

73 

12.6 

65 

AS 
44 
6.5 
5.1 
7.1 
95 
43 
AM 
53 
A3 
55 
7.6 
6 

12.6 
9 

Antigen, Prostate Specific Aft. Splice 
6 

4.9 
6 

5.5 

5.1 

11.7 

145 

5 

72 

AM 

185 

4.3 

5.8 

4.6 

1&3 

103 

4.4 

113 

113 

4.8 

13 

AM 

AM - 

6.1 

92 

93 

43 

43 

43 

73 

5.4 

63 

125 

43 

92 

45 

8.7 

103 

43 
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331848 AA417039 Hs.98268 signal recognition particte 72kD 73 

331873 AA429445 H&98640 ESTa 63 

331889 AA431407 H&98802 Homo sapiens Chromosome 16 BAG done CTT 333 

331967 AA460158 Hs39589 KIAA1028 protein 63 

5 331974 AA464518 Hs.105322 ESTs 53 

332043 AA490831 Hs201591 ESTs 103 

332076 AA599477 Hs291156 ESTs 4.4 

332173 F09281 Hs.1 00725 ESTs 63 

332247 N58172 ESTs 142 

10 332249 N62Q96 Hs.194140 ESTs 72 

332325 T79428 Hs339667 ESTs 53 

332396 AA340504 ESTs; Weakly similar to simflarto human 212 

332434 N75542 Hs237731 transcription factor 4 153 

332493 N95495 Hs36729 ESTs; Highly similar to GTP-binding prot 7.1 

IS 332522 L38503 Hs.1 78357 glutathione S-tnyurferase meta 2 63 

332526 AA281753 Hs.17731 Inositol 1 ^triphosphate receptor; ty 53 

332530 M31682 Ha 19280 inhibtn; beta B (acthdn AB beta potypep 53 

332533 M99487 Hs325825 folate hydrolase (prostate-specffic memb 38.1 

332538 N48715 HS20991 ESTs 63 

20 332546 D84454 Hs22587 solute carrier family 35 (UDP-gakctose 43 

332594 AA279313 Hs32951 methyl CpQ binding protein 2 53 

332610 AA412405 Hs.40513 ESTs; Weakly shntof to BETA GALACTOSIDA 53 

332661 N95742 HS3390 ESTs 63 

332697 T94885 Hs.75725 carboxypepfidase E 243 

25 332712 D25070 Hs.79306 Inositol 1 ^triphosphate receptor; ty 93 

332716 L00O58 Hs.79630 v-myc avian myelocytomatosls viral oncog 53 

332726 R72029 Hs33428 synaptophysHB protein 5 

332781 AA233258 ESTs; Weakly similar to Dl 0075 [Cetega 43 

332797 CH22J=GENES.6_2 303 

30 332798 CH22_FGBJES£_5 663 

332799 CH22J=GENES3_6 193 

332933 CH22_FGENES38_7 53 

332980 CH22_FGENES54J 53 

332884 CH22.FGENES34 J 43 

35 333168 CH22J=G^ES34J 4.7 

333169 CH23J=GENES34_2 44 

333452 CH22_FGENES.157J 43 

333456 CH22J=GENES.157_5 43 

333458 CH22J=GB^ES.157_7 43 

40 33361 1 CH22J=GBIES217J5 4.7 

333621 CH22J=QENES219J5 53 

333814 022_FGENES282_2 7.1 

333849 CH22_FGB<ES290_B 62 

333949 CH2*J=GENES303j5 43 

45 333951 CH2^J=GENES303_7 43 

333955 CH22J=QENES.303J1 53 

334150 CH22.FGENES.339J 5.1 

334223 CH2^FGBIES360j4 203 

334297 CH22LFGENES372J3 9.4 

50 334443 CH2^JGENES387 - 2 43 

334444 CH22J=GENES387_4 53 

334447 CH2*_FGENES387_7 13.1 

334570 CH22J=GENES.405J1 SA - 

334749 CH22_FGENES.427J 53 

55 334777 CH22_FGENES.430J9 4.7 

334960 CH22J=GENES.465_29 52 

335179 CH22J=GENES304J 83 

335293 CH22J=GENES327JB 4.7 

335550 CH22J=GENES376J1 5.1 

60 335581 CH22LFQENES381J9 5.7 

335586 CH22_FGBJES381_25 43 

335809 CH22_FGENES317_6 62 

335810 CH2^.FGENES317_7 53 
335822 CH22J=GENES319J7 7.1 

65 335824 CH22J=GENES319J1 83 

335853 CH22_FGENES326J 43 

335886 CH22J^B4ES332_4 43 

336034 CH22J=GENES378J 63 

336441 CH2^FGBIES327J 73 
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336624 


CH22_FGENES.6-3 


43.3 




336625 


CH22_FGB^ES£4 


37.9 




336679 


CH22J=GENES>43-7 


5.3 




337577 


C^a_C65E1.GENSCAN^1 


45 


5 


338255 


CH22_Btf:AC005500.GENSCAN .276-3 
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338260 


CH22^EM^C005500.GBJSCAN279-10 


4.6 




338561 


CH22_EM^C005500.GENSCAN421-5 


4.6 




338562 


CH22_EM^C005500.GENSCAN.421-6 


4.3 


10 


338759 


CH22_EMAC005500.GENSCAN517-6 


5.1 


338763 


CH2a.EMJ^005500.GENSCAN^17-16 


5.5 




338764 


CH22LEM^C005500X3ENSCAN517-17 


7.1 
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TABLE 3A shows the accession numbers for those primekeys lacking unigenelD's for Table 
3. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

123619 371661 1 AA602964 AA609200 

116722 143512L1 Z24S7B AA494098 F13854 AA434040 AA143127 

103677 41847J Z83806 AJ132091 AJ1 32090 

20 125992 1589048 J H48372W01626 

109342 genbanK_AA213620 AA213620 

125154 genbanK.W38419 W38419 

101447 emrez_M21305 M21305 

124357 genbank_N22401 N22401 
25 108910 genban)LAA136590 AA136590 

322278 47271.1 W69304 AF0862B3 WB9200 

315084 350959.1 AI821085AW973464AA554802 AI821831 AA657438 AA640756 AA650339 

324019 262782J AW177009AI381610 
324330 300543.1 AA884766 AW974271 AA592975 AA447312 

30 324626 336411.1 AI685464 AW971336 AA513587 AA525142 

303029 37699 1 AF199613 AF108756 

324804 398093J AJ692552 AJ393343AI800510AB77711 R24263AA661878 

324961 37Q239J AA613792AW1 82329 T05304AW858385 

329362 QJLhs 
35 336624 CH22_4071FQJL3_ 

336625 CH22_4072FGL6_4_ 

336879 OB2_4157FGL43_7_ 

338255 CH22_6356FG_UNieai^C00 

338260 CH2a.686a=Q_MNK.EM^C00 
40 329929 C16_D2 

329960 c16_p2 

338561 CH22_7294FQ_UN>eEI^AC00 

338562 CH22_7295FQ_UNHLEM^C00 
Arm ' 338759 (m.7581FCL_MNK-EMAC00 
45 338763 CH22^7585FGUJNKJEM^C00 

338764 C^7586FCLJJNieEMAC00 

333168 CK2e_400FGLWJ.UNieEMA 

333169 CH22 401 FGJWJLUNK_EM:A 
333452 CH22L702FGL157_1JJNK_EM: 

50 333456 CH22_706FGL157_5_UN}eEM: 
r 333458 CH22_708FGL157_7_UNK_EM: 

333611 0^872FQJ17.CUJNKJM: 

333821 CH2^882FO_219.5JJNKJM: 

333814 CH22_1082=aj82^JJNK L .EM 
55 ' 333849 CH22_1118FGL290JBjJNieEM 

335179 CH22_2515FCL504_9_UNK.EM 

333949 CH2a.1225FGL3033JJNlCEM 

333951 CH2^.1227FQ.303JJJNK.EM 

333955 CH22J231FGLJ3Q3J 1JJNK_E 
60 335293 CH22^635F<L527j3JJNK_EM 

326818 c20JlS 

326997 C21 hs 

335550 CH^_2905i=CL576.11JJhBCE 
335581 CH22^938FGU581_19JJNrCf 
65 335586 CH2W944FGL581 JSJJNKJE 
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328492 <l7Jis 

335809 CH22_3181f=Q_617_6_L(NK_EM 

335810 CH2a.3182FGL617JL.UNK.EM 
335822 CH22_3195FG_619.7.UNK_EM 

5 335824 CH2a_3197FCL619J 1_UNK_E 
335853 CH22_3228FG_626_5_UNK^EM 
335888 CH22_3261RL632 - 4JJM!eEM 
330020 c16_p2 
330211 CL5_p2 
10 337577 CH22_5864FCUJNieC65E1.Q 
307848 AJ364185 

332737 CH22_13FGU6^JJNieC4Q1.Q 

332798 CH22J4FG_6_5_UMK_C4G1 .Q 

332799 CH22J5FGL6_6_.UNK.C4G1 .Q 
15 334150 CH22.1429FGL339_1_UNIC.EM 

332933 CH22_154FG_38_7_UNK_C20H 
332980 CH22_204FGL54JJJNK,EMA 
332984 CH22_208FG.54_6JJNK.EMiA 
334223 CH22L1507FQ_360_4_UNK_EM 
20 334297 Oe2_158BFGL372_3.UNK.EM 
327098 C21_hs 

334443 CH22_1742FQ_387^UNK.EM 

334444 CH22J743FG.387jUJNK.EM 
334447 CH22.1746FCL387_7_UNK.EM 

25 334570 CH22.1875FGL405.11JJNKJE 
334749 CH22_2061FGL427.1_UNK_EM 
334777 CH22_2089FG_430J9_UNK.EM 
336034 CH22.3419FC3.678_5.UNK.DJ 
334960 CH22_2281FGt465j29JJNK_E 
30 336441 CH22L3881FG.827.7.UNK_OJ 

330551 9851.2 U39840 NM.004496 AW135607BE087458BE087567M177116 AW195705AW750756AI811008 AI694151 

BE348594 AW971075 A1347950 AI201455 AI073898 AA652680 AA613671 AI31B364AA507S50 AA693692 
AI032599 AA991871 AI269801 AW948974 774639 AA532907 AW949173 
330786 53973.3 BE379594 AI192455 AL039862 AI744012 AI761735 AW2431B1 AI7436B7 AI928223 AI423022 AI627655 

35 AI636059A1651571 AW802044 AI826995 AI431733 A15391 25 AAB63056 AW270910 AI768930 AW008835 

AW615183 AW591147 A1695294 AI672106 AA506358 AI308060 AA011556 AA962437 AI935488 BE219625 
AHXM356 AW151394 A1218466 N86178 AI4197B4 AW242519 AW946907 D60374 AA989263 AI698799 
AA470460AI824167 

332247 S72969 1 AA669097 AA513815 AAQ26798 AA676526 AA704429 AA704289 AW1 18292 AA579216 N58172 

40 332396 20265 1 AW579842 BE15S562 BE156690 BE156489 BE081033 AK001559 BE1 49402 M85387 AW367811 AW3677B8 

R17370 AI908947 AA382932 R58449 H1 8732 AA371231 AW962899 AA713530 AW832946 R53463 H11063 
AW068542 Z40761 BE176212 BE176155 W23952 W92188 AW374883 AA303497 AW954769 AA036808 
BE1 68063 AW332073 AW3820S5 AL041475 H80748 AID7B161 BE463383 AJ 8052 13 AI761264 W94885 
N94502 A1623772 AI419532 AI810302 AI634190 AW002516 AW150777 AI352312 AI367474 AW204807 
45 A18755Q2 AI337026 AW134715 BE328451 A! 123157 A1560020 AB00745 A1608631 Ai248873 AA742484 

AW051635 H18646 AI245045 AA5071 11 AI640510 AI925594 AA115747 AA143035 AA151106 
332781 32044 1 AK001764 BE313896 AA380199 AA380151 AA194996 AW1 18089 AA495871 AW975219 AW085598 

AB789Q9 AW992310AW992409 AI911857AA657643AW04471 AI2425B9 AI623968 R09556 A1129100 
AI206500 AA680094 AA677784 AI023178 A1277519 AA424742 AI240654 AA232846 AI804273 A1382376 
50 AA001729 W90790 BE090656 AW295015 AI674596 A1431734 AW20517 AW769185 AJ128355 AI192474 

AI820001 AA001929 AA706925 AI076676 A1499119 AI200493 A1695919 AI376217 W69195 W69261 
AW305099 W90320 BE048357 AI658856 AA838534 AA233258 AI753393 AA709227 AK74387 AJ87261 6 
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TABLE 3B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Table 3. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7-digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position



333611 
333621 
333814 



334150 
334297 
334443 
334444 
334447 
334570 
334777 
335179 
335581 



335810 



336034 
336441 
337577 



332797 
332798 
332799 



332984



333169 



333456 
333458 
334223 
334749 



336679 



Dunham, I. etai 
Dunham, I. eta). 
Dunham, I. etaL 
Dunham, I. etaL 
Dunham* I. etai 
Dunham J. etaL 
Dunham, I. etai 
Dunham, t. etaL 
Dunham, L etai, 
Dunham, L etaL 
Dunham, I. etaL 
Dunham, L etaL 
Dunham, I. etaL 
Dunham, I. etai. 
Dunham, L etaL 
Dunham, I. etai. 
Dunham, LetaL 
Dunham, L etaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, t etai. 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, L etaL 
Dunham, L etaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, L etaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, L etai 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 



Plus 

Pius 

Plus 

Plus 

Plus 

Pais 

Phis 

Phis 

Phis 

Phis 

Phis 

Phis 

Phis 

Plus 

Plus 

Phis 

Phis 

Phis 

Plus 

Pius 

Pius 

Pius 

Phis 

Plus 

Plus 

Pius 

Mmus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Minus 

Mnus 

Minus 

Minus 



6548368-6548507 
8597414-8597560 
7894165-7894252 
8018323-8018472 
8589634-8589791 



359741*8597560 

10529221-10529854 

13420934-13421058 

1429898M4299056 

14306433-14306492 

1430876*14308824 

14994868-14994943 

16259586-16260166 

21634405-21634526 

24976198-24976334 



26310772-26310909 
26314767-28314849 
26364087-26364196 



29014404-29014590 

34187606-34187663 

595377-595678 

15458919-15459257 

216964-216798 

232147-231974 

232421-232307 

2035790-2035681 

5136165-5136019 



3729896-3729788 

3730864-3730767 

5136165-5136019 

26319332631797 

5143942-5143806 

12734365-12734269 

16090686-16090106 

20160968-20160795 

22316403-22316275 

24668714-24668658 

26614629-26614506 

227714-227577 

229124-229024 

2035790-2035681 

15242294-15242231 

22311966-22311856 

22312594-22312465 

26582475-26582199 



26641232-26641101 
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329980 5091594 

329929 6165201 

330020 6671887 

326816 6552458 

326997 5867660 

327098 6682516 

330211 6013592 

328492 5868455 

329362 5868837 



Minus 103M162 

Minus 156410-156553 

Plus 172397-172491 

Plus 198354-198436

Minus 71389-72147 

Minus 1061684-1062361 

Plus 59158-59215 

Minus 46094-46241 

Minus 65688-68173 
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TABLE 4: shows a preferred subset of the Accession numbers for genes found in Table 3 
which are differentially expressed in prostate tumor tissue compared to normal prostate 
tissue. 



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal body tissue



Pkey ExAccn UnigeneID Unigene Title R1


100819 


HO40204fT4290H&2387 Transgrutaminase 


105 


102698 


U75272 


Hs.1867 progastrtesin (pepsinogen C) 


103 


102869 


XQ2544 


H&572 orosomucoHl 


223 


105370 


AA236476 


H&22791 ESTs; Weakly similar to transmembrane pr 


103 


106645 


AA282138 


Hs.11325 ESTs 


14 


106094 


AM19461 


Hs33317 ESTs 


109 


109014 


AA1B6790 


H&262036 ESTs 


153 


109562 


F01R11 


Hs.187931 ESTs; Moderately sinto to vottage-gate 


103 


113021 


T23855 


Hs.129836 WAA1028 protein 


103 


114124 


238595 


Hs.125019 ESTs; Highly similar to K1AA0886 protein 


213 


122791 




Hs.129836 WAA1028 protein 


124 


124352 


N21626 


Hs.102406 ESTs 


103 


301042 


A1659131 


Hs.197733 ESTs 


243 


302005 


A1869666 


Hs.123119 ESTs 


363 


302410 


NM 004917 


Hs318366 EST cluster (not In UnlGene) with exonh 


263 


302881 


AA508353 


Hs.105314 retell (HI) 


783 


303344 


AA255977 


H&250646 ESTs; Highly similar to uWqulfln-conjug 


193 


303753 


AW503733 


H&9414 ESTs 


13 


310431 


AI42Q227 


Hs.149358 ESTs 


723 


311251 


AI655662 


HS.197698 ESTs 


413 


311596 


AJ682088 


H&79375 ESTs 


264 


312153 


AA759250 


H&118625 cytochrome b-561 


11 


312521 


AA033609 


H&239884 ESTs 


113 


313676 


AA861697 


Hs.120591 EST cluster (not In UniGene) 


13.4 


314171 


AI821895 


Hs.193481 ESTs 


294 


314907 


AI672225 


H&222886 ESTs 


193 


315051 


AW292425 


Hs.163484 EST 


153 


315052 


AA876910 


Hs.134427 ESTs 


20 


317548 


AI654187 


Hs.195704 ESTs 


143 


317869 


AW295184 


Hs.129142 ESTs; Weakly similar to DEOXYRiBONUCLEAS 13-8 


316428 


AI949409 


Hs.194591 ESTs 


123 


318524 


AVU291511 


Hs.159066 ESTs 


253 


319080 


Z45131 


H&23023 ESTs 


163 


319763 


AA460775 


Hs3295 ESTs 


143 


320324 


AP071202 


Hs.139336 ATP-Wnding cassette; sub-lamBy C (CFTR 


563 


321441 


AW297633 


Hs.1 18438 ESTs 


14.7 




W07459 


Hs.1 57601 EST duster (not in UniQene) 


22 


322782 


AA056060 


H&2Q2577 EST duster (not in UniQene) 


184 


322818 


AW043782 


Hs393616 ESTs 


10J 


323287 


AA639902 


Hs.104215 ESTs 


24.7 


324603 


AW016378 


H&292934 ESTs 


243 


324617 


AA508552 


Hs.195839 ESTs 


54 


324658 


AI694767 


H&129179 ESTs 


22 


324691 


A1217963 


H&293341 ESTs; Weakly similar to Pro-a2(Xf) [H.sa 


103 


324698 


AA641092 


Hs357339 ESTs 


103 


324718 


AI557019 


Hs.1 16467 ESTs 


344 


330211 




CRQ5_p2gi]8013592 


123 


330430 


HG2261-HT2352 Hs321 110 Arrfigen, Prostate Specific, Aft. Splice 


133 


330706 


AA1 21 140 


Hs.177576 ESTs; Moderately similar to kynurenlne a 


143 


330762 


AA449677 


Hs.15251 Human DNA sequence from done 437M21 


on 183 


330892 


AA149579 


HS-91202 ESTs 


153 


330949 


H01458 


Hs.1 42896 ESTs 


103 
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331093 R36671 Hs. 14846 ESTs 113 

331151 R82331 K&268838 ESTs 13 

331889 AA431407 Hs.98802 Homo sapiens Chromosome 16 BAC dona CIT 33.6 

332247 N58172 ESTs 143 

5 332398 AA34Q504 ESTs; Weakly similar to simflarto human 212 

332533 M99467 Hs325825 folate hydrolase (prostate-specific memb 38.1 

332697 T94885 Hs.75725 carfcoxypeptkiase E 243 

332797 C*fcaj£BJES.B_2 303 

332798 CH22_FGENES3_5 663 
10 332799 CH23J=GENES3_6 193 

334223 CH22LFQENES360J 203 

336624 CH22.FGENES.6-3 433 

336625 CH22J=GENES34 373 






WO 02/30268 



PCT/US01/32045 



TABLE 4A shows the accession numbers for those primekeys lacking unigeneID's for Table
4. For each probeset we have listed the gene cluster number from which the oligonucleotides
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column.

Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers

Pkey CAT number Accession

336624 CH22_4071FG_6_3_ 

336625 CH22L4072FG_6jL 
330211 c 5_p2 

20 332797 CH2^13FGJBJJJNKJC4G1.G 

332798 CH2^14FGL6_5JJNieC4G1.G 

332799 CH22L15FG_6_6JJNK_C4G1.G 
334223 CH22J507FG_360_4JJNK_EM 

332247 372969J AA669097 AA513815 AA026798 AA576526 AA704429 AA704269 AW1 18292 AA579216 N58172 

25 332396 20265 J AW579842 BE156562 BE 156690 BE156489 BE081033 AK001559 B E1 49402 M85387AW367B11 

AW367798 R17370 AI908947 AA382932 R58449 H18732 AA371231 AW962899 AA713530 AWB92946 
R53463 H11063 AW068542 Z40761 BE176212 BE176155 W23952 W921B8 AW374883 AA303497 
AW954769 AA036808 BE168063 AW3B2073 AW382085 AL041475 H80748 AI078161 BE463983 
A1805213 AI761264 W94885 N945Q2 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777 
30 AI352312 AI367474 AW204807 AJ675502 AI337Q26 AW134715 BE328451 AI123157 A1560020 

A1300745 AI608631 AI248873 AA742484 AW051635 H18646 AI24S045AA507111 AI640510AI925594 
AA115747AA143035AA151106 
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TABLE 4B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Table 4. For each predicted exon, we have listed the genomic sequence
source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7-digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey    Ref                Strand  Nt_position

332797  Dunham, I. et al.  Minus   216964-216796
332798  Dunham, I. et al.  Minus   232147-231974
332799  Dunham, I. et al.  Minus   232421-232307
334223  Dunham, I. et al.  Minus   12734365-12734269
336624  Dunham, I. et al.  Minus   227714-227577
336625  Dunham, I. et al.  Minus   229124-229024
330211  6013592            Plus    59158-59215
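
As an illustration of how the Strand and nucleotide-position columns of Tables 3B and 4B might be read, the following is a minimal sketch rather than part of the disclosed method. It assumes that positions are inclusive coordinates on the reference sequence and that Minus-strand exons are listed from the higher coordinate to the lower one, as the listed rows suggest. The names `PredictedExon` and `parse_exon` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    """One row of a genomic-positioning table (hypothetical container)."""
    pkey: int
    ref: str      # "Dunham, I. et al." or a Genbank GI number
    strand: str   # "Plus" or "Minus"
    first: int    # first listed coordinate
    last: int     # second listed coordinate

    @property
    def length(self) -> int:
        # Assumption: both coordinates are inclusive.
        return abs(self.last - self.first) + 1


def parse_exon(pkey: int, ref: str, strand: str, nt_position: str) -> PredictedExon:
    """Parse a 'start-end' position string together with its strand label."""
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand label: {strand!r}")
    first_text, last_text = nt_position.split("-")
    return PredictedExon(pkey, ref, strand, int(first_text), int(last_text))


# Two rows taken from Table 4B.
minus_exon = parse_exon(332797, "Dunham, I. et al.", "Minus", "216964-216796")
plus_exon = parse_exon(330211, "6013592", "Plus", "59158-59215")
print(minus_exon.length, plus_exon.length)
```

Taking the absolute difference keeps the length calculation independent of the order in which the two coordinates are listed for each strand.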
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TABLE 5: 1170 GENES UP-REGULATED IN PROSTATE CANCER COMPARED TO 

NORMAL ADULT TISSUES 

Table 5 shows 1170 genes up-regulated in prostate cancer compared to normal adult tissues.
These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip array such
that the ratio of "average" prostate cancer to "average" normal adult tissues was greater than
or equal to 3.44. The "average" prostate cancer level was set to the 85th percentile amongst 73
prostate cancers. The "average" normal adult tissue level was set to the 85th percentile
amongst 162 non-malignant tissues. In order to remove gene-specific background levels of
non-specific hybridization, the 7.5th percentile value amongst the 162 non-malignant tissues
was subtracted from both the numerator and the denominator before the ratio was evaluated.
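
The selection rule described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the software actually used: the percentile method (linear interpolation between order statistics) is an assumption, since the text does not state one, and the function names `percentile`, `tumor_to_normal_ratio` and `is_selected` are hypothetical.

```python
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Percentile by linear interpolation (assumed; the method is not stated)."""
    if not values:
        raise ValueError("no expression values supplied")
    ordered = sorted(values)
    rank = pct * (len(ordered) - 1) / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def tumor_to_normal_ratio(
    tumor: Sequence[float],
    normal: Sequence[float],
    level_pct: float = 85.0,
    background_pct: float = 7.5,
) -> float:
    """Background-subtracted ratio of 'average' tumor to 'average' normal level."""
    # Background: 7.5th percentile amongst the non-malignant tissues.
    background = percentile(normal, background_pct)
    numerator = percentile(tumor, level_pct) - background
    denominator = percentile(normal, level_pct) - background
    if denominator <= 0:
        raise ValueError("normal level does not exceed background")
    return numerator / denominator


def is_selected(ratio: float, threshold: float = 3.44) -> bool:
    """A probeset is kept when the ratio is greater than or equal to 3.44."""
    return ratio >= threshold


# Toy data for one probeset (not taken from the tables).
normal_levels = [float(v) for v in range(201)]
tumor_levels = [1000.0] * 5
ratio = tumor_to_normal_ratio(tumor_levels, normal_levels)
print(round(ratio, 2), is_selected(ratio))
```

Because the same background value is subtracted from the numerator and the denominator, a probeset with a high non-specific signal in every tissue is not selected merely because its raw intensities are large.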






Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Ratio of tumor to normal tissue

Pkey ExAccn UnigeneID Unigene Title R1


446057 AI420227 H&149358 ESTs, Weakly similar to A46010 X-linked 8642

400302 N48056 Hs.1915 folate hydrolase (prostate-specific memb 66.46

414569 AF109298 Hs.118258 prostate cancer associated protein 1 58.36

417407 AA923278 H&290905 ESTs, Weakly similar to protease [H.sapi 56.16

431579 AW971082 H&222888 ESTs, Weakly similar to TRHY_HUMAN TRICH 53.38

409361 NM_005982 Hs£4416 sine oculis homeobox (Drosophila) homolo 4828

409731 AA125985 H&56145 thymosin, beta, identified in neuroblast 4524

400298 AA032279 Hs.61635 six transmembrane epithelial antigen of 43.48

420154 AI093155 Hs.95420 JM27 protein 41.12

433466 AA508353 Hs.105314 relaxin 1 (H1) 39.88

400293 AA305627 Hs.139336 ATP-binding cassette, sub-family C (CFTR 3842

400292 AA250737 Hs.72472 ESTs 38.00

432887 AI926047 Hs.162859 ESTs 36.48

439176 A1446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr 36.45

430722 AW968543 H&203270 ESTs, Weakly similar to ALU1_HUMAN ALU S 3120

437052 AA861697 HS.120591 ESTs 33.02

418398 AI765805 H&26691 ESTs 32X8

434038 A(659131 Hs.197733 hypothetical protein MGC2849 3244

407709 AA456135 HS23023 ESTs 32.10

426747 AA535210 Hs.171995 kallikrein 3, (prostate specific antigen 3130

407168 R45175 ESTs 31.72

440260 A1972887 Hs.7130 copfetetV 3052

421513 X00949 Hs.105314 relaxin 1 (H1) 30.10

416370 N90470 H&203697 ESTs, Weakly similar to I38022 hypotheti 2938

407122 H20276 HS31742 ESTs 2924

400287 S39329 Hs.181350 kallikrein 2, prostatic 28 JO

432244 Al 669973 H$20Q574 ESTs 28.74

451939 U80456 H&27311 single-minded (Drosophila) homolog 2 28.74

416989 AI267700 Hs.111128 ESTs 2834

418961 AW967646 H&23Q23 ESTs 2754

425628 NM_004476 Hs.1915 folate hydrolase (prostate-specific memb 27-32

458509 AA654650 H&282906 ESTs 2724

448290 AK002107 Hs20843 Homo sapiens cDNA FLJ11245 fis, clone a 27.16

428336 AA503115 Hs.183752 microseminoprotein, beta- 26.17

450096 AI682088 HS223368 holocarboxylase synthetase (biotin-[prop 25.60

400299 X07730 Hs.171995 kallikrein 3, (prostate specific antigen 24J91

437571 AA760894 Ha.153023 ESTs 24.74

453160 AI263307 Hs.146228 H2B histone family, member L 2436

453096 AW294631 Hs.11325 ESTs 2446

425075 AA506324 Hs.1852 acid phosphatase, prostate 2423

407202 N58172 Hs.109370 ESTs 24.18
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424846 AU077324 Hs.1832 neuropepfideY 2357 

453370 AI470523 Hs. 182356 ATP-binding cassette, subfamily C (CFTR 23.16 

422805 AA438989 Hs.121017 H2Amstone family, member A 2252 

444917 R68651 Hs.144997 ESTs 2226 

5 408826 AF216077 H&46376 Homo sapiens clone HB-2 mRNA sequence 22.02 

413597 AW3Q2885 Hs.117183 ESTs 21.76 

426429 X73114 Hs.169849 myosin-binding protein C, slow-type 2152 

435981 H74319 Hs.188620 ESTs 21.12 

432866 AA650114 ESTs 2157 

10 418848 AI820961 Hs.193465 ESTs 21D6 

405685 2050 

443271 BE568568 H&195704 ESTs 1958 

418819 AA228776 Hs.191721 ESTs 1954 

420757 X78592 Hs59915 androgen receptor (dihydrotestosterone r 19.72 

IS 418994 AA296520 H&89546 setecfin E (endothelial adhesion moJecul 1956 

429918 AW873986 Hs.119383 ESTs 19.04 

415539 AI733881 Hs.72472 ESTs 1843 

450382 AA397658 H&60257 Homo sapiens cONA RJ 13598 Gs, clone PL 1834 

418828 AA516531 H&55999 NK homeobox (Drosophfla), famBy 3, A 1828 

20 429984 AL050102 Hs227209 hypothetical protein FU21 617 1732 

443822 AKJ87412 Hs.143611 ESTs, WeaWy similar to 2004399A chromos 17.66 

431678 AI685464 H$29263B gb^tS8f04jc1 NCLCQAP_Pr28 Homo sapiens 17.64 

410330 AW023630 H&46786 ESTs 1752 

432441 AW292425 Hs.163484 ESTs 1741 

25 452792 AB037765 H&30652 K1AA1344 protein 1759 

445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAAQ293 gene, par 17.00 

414565 AA502972 Hs.183390 hypothetical protein FU13590 1652 

430487 D87742 Hs241552 WAA0268 protein 16.72 

431716 D89053 Hs268012 fatty^Coenzyme A Bgase, long-chain 1650 

30 4.19536 AA603305 gb:np12d11.s1 NCLCGAP_Pr3 Homo sapiens 1650 

439677 R82331 Hs. 164599 ESTs 1646 

449625 NMJ514253 Hs23786 odz (odd Oz/ten-m. Drosophda) homoiog 1 1652 

408430 S79876 Hs.44926 dtpeptidytpeptidase IV (CD26, adenosine 1628 

447033 AI357412 Hs.157601 ESTs 16U>2 

35 453006 A1362575 Hs.167133 ESTs 15.74 

431474 AL133990 Hs.190642 ESTs 15.70 

420218 AW958037 Hs22437 ribosomal protein L4 1554 

408000 L1 1690 Hs.620 bullous pemphigoid antigen 1 (230V240kD) 1554 

416208 AW291168 Hs.41295 ESTs, Weakly similar to MUC2_HUMAN MUCIN 1548 

40 430226 BE245562 H&2551 adrenergic, befa-2-, receptor, surface 1540 

415263 AA948033 Hs.130853 ESTs 1558 

432437 W07088 Hs293685 ESTs 1526 

428398 A1249368 H&98558 ESTs 1521 

429900 AA460421 Hs50875 ESTs 1450 

45 449156 AF103907 Hs.171353 prostate cancer antigen 3 1459 

411096 U80034 Hs58583 mitochondria! Intermediate peptidase 1451 

435974. U29690 Hs57744 Homo sapiens beta-1 adrenergic receptor 14.76 

444484 AK002126 Hs.11260 hypothetical protein RJ1 1254 14.76 

422728 AW937826 Hs.103262 ESTs, Weakly similar to ZN91.HUMAN ZINC 1450 

50 418601 AA279490 Hs.86368 calmegln 1456 

448999 AF1 79274 Hs22791 transmembrane protein with EGRke and 1455 

445885 AI734009 Hs.127699 K1AA1603 protein 1444 

452712 AW838616 gb«C5lT0054-14020(H)13-D01 LT0054 Homo" 1422 

432189 AA527941 gbmh30c04.s1 NCLCGAP J>r3 Homo sapiens 14.12 

55 424565 AW102723 Hs.75295 guanytate cyclase 1 , soluble, alpha 3 13.78 

429290 AF203032 Hs.198760 neurofilament heavy polypeptide (200kD) 1357 

419264 AA877104 Hs293672 ESTs, Weakly similar to ALUB_HUMAN illl 1340 

416445 AL043004 Hs500678 K1AA01 35 protein 1352 

407275 AI364186 gb:qw34h07jc1 NCI.CGAP^UW Homo sapiens 1324 

60 408369 R38438 Hs.182575 soUite carrier family 15 (H^epfidetra 1321 

446720 A1439136 Hs.140546 ESTs 1356 

434988 A1418055 Hs.161160 ESTs 1352 

448172 N75278 Hs.135904 ESTs 1258 

416182 NM_004354 Ks.79069 cycGnQ2 1254 

65 420544 AA677577 Hs.98732 Homo sapiens Chromosome 16 BAC done CTT 12.79 

445413 AA151342 Hs.12677 CGM47 protein 1254 

452588 AA889120 Hs.110637 homeoboxAlO 1252 

407819 R42185 Hs274803 ESTs 1250 

433444 AW975324 Hs.129816 ESTs 1250 
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421059 AI654133 H&S0212 thyroid receptor interacting protein 15 1230 

420077 AW512260 Hs37767 ESTs 1224 

453930 AA419466 H&36727 hypothetical protein FU 10903 1222 

441610 AW576148 Hs.148376 ESTs 1220 

5 451009 AA013140 Hs.1 15707 ESTs 12.18 

433764 AW753676 H&39982 ESTs 12.16 

440266 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04 

443912 R37257 Hs.1 84780 ESTs 11 £2 

419526 A1821895 Hs.1 93481 ESTs 11-91 

10 423073 BE252922 Hs.123119 MAD (mothers against decapentaptegfe, Dr 1UB7 

452784 BE463857 Hs.1 51258 hypothetical protein RJ2 1062 1U66 

414422 AA147224 Hs.71814 ESTs 11.76 

450203 AF097994 H&301528 Ucynurenlr^afcha^mirioadrpatB aminotm 11 38 

436679 AI127483 Hs.120451 ESTs, Weakly similar to unnamed prote&i 11-60 

15 440901 AA909358 H&128612 ESTs 1130 

448045 AJ297438 H&20166 prostata stBm oeP antigen 11-51 

433887 AW204232 Hs279522 ESTs 11J50 

434980 AW770553 Hs283640 sterol O-acytaisferase (acyVCoenzyme 1138 

425905 AB032959 Hs.1 61 700 novel C3HC4 type Zinc finger (ring tinge 1133 

20 434680 T11738 Hs.127574 ESTs 1132 

449650 AF055575 Hs297647 calcium channel, voltage-dependent, L ty 11.18 

431173 AW971198 Hs294068 ESTs 11.18 

434539 AW748078 Hs214410 ESTs, Weakly simBar to MUC2_HUMAN MUCIN 11.16 

410037 AB020725 HS38009 WAA0918 protein 11.14 

25 417708 N74392 H&50495 ESTs 11.14 

458332 AI000341 H&220491 ESTs 11.12 

420381 D50640 Hs301782 phosphodiesterase 38, cGMP-inhMed 11.10 

425665 AK001050 Hs.159066 hypothetical protein FU10188 1138 

425710 AF030880 Hs.159275 sciute carrier family, member 4 11J08 

30 428728 NM.016625 Hs.191381 hypoftetlcal protein 1134 

407021 U52077 gb:Human marinerl transposase gene.comp 11J02 

410733 D84284 Hs.65052 CD38 antigen (p45) 11X12 

401714 1030 

434485 AI623511 Hs.118567 ESTs 1039 

35 415786 AW419196 H&257924 hypothetical protein FU13782 1037 

452340 NM_0022Q2 H&505 ISL1 transcription factor, UM/homeodoma 1035 

453628 AW243307 Hs.170187 hypothetical protein 10.72 

408063 BE088548 Hs.42346 csJcmeunn-blnding protein caJsardrM 1037 

417687 AI828596 H&250691 ESTs 1034 

40 434666 AF151103 Hs.112259 Toefl receptor gajnmatocus 1033 

432374 W68815 Hs301885 Homo sapiens cONA FU1 1346 fis, done PL 1030 

428819 AL135623 Hs.193914 WAA0575 gene product 1048 

413409 AI638418 Hs21745 OEAD/H (Asr>Glu-Ala-Asp/His) box potypep 1044 

428775 AA434579 Hs.143691 ESTs 1021 

45 . 436556 AI364997 Hs.7572 ESTs 1020 

441690 R81733 H&33108 ESTs 10.14 

419852 AW503756 Hs288184 hypothetical protein CU551D23 10.10 . 

421991 NM.014918 Hs.1 10488 K1AA0990 protein 1034 

423698 AA329796 Hs.1098 DKFZp434J1813 protein 1032 

50 452039 A1922088 Hs.172510 ESTs 1030 

433043 W57554 Hs.125019 ESTs 038 

433927 A1557019 Hs.1 16467 small nuclear protein PRAC 937 

445424 AB02B945 Hs.12696 cortactin SH3 domain-binding protein * 938 

432240 AI694767 Hs.129179 Homo sapiens cDNA FU13581 fis, clone PL 938 

55 433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prat 934 

452744 AI267652 Hs30504 Homo sapiens mRNA; cONA DKFZp434E082 (fr 932 

431217 NWL013427 Hs260830 Rho GTPase activating protein 6 9.75 

427398 AW390Q20 Hs20415 chromosome 21 open reading frame 1 1 9.70 

446896 T15767 Hs22452 Homo sapiens mRNA for KIAA1737 protein, 9.70 

60 421470 R27496 Hs.1 378 annexfnA3 934 

406554 930 

401424 9-58 

407902 AL117474 Hs41181 Homo sapiens mRNA; cONA DKFZp727C191 (fr 936 

423545 AP000692 Hs.1 29781 chromosome 21 open reading frame 5 934 

65 439024 R96696 H&35598 ESTs 931 

431548 AI834273 H&9711 novel protein 948 

409262 AK000631 Hs.52256 hypothetical protein RJ20624 945 

446271 D82484 Ks.100469 ESTs 9.42 

448692 AW013907 Hs224276 rrotrryk^tofwyK^oer^yme A carboxylase 2 926 






414140 AA281279 Hs.23317 hypothetical protein FLJ14681 824
435980 AF274571 Hs.129142 deoxyribonuclease II beta 9.24
421246 AW582962 Hs.300961 OGM7 protein 920
427304 AA761526 Hs.163853 ESTs 9.16
442914 AW188551 Hs.39519 hypothetical protein FLJ14007 9.16
413627 BE162082 Hs.246973 ESTs 9.14
439699 AF08S534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 9.10
437718 AI927288 Hs.196779 ESTs 9.07
439820 AL360204 Hs.283853 Homo sapiens mRNA full length insert cDN 9.06
447342 AI199268 Hs.18322 Homo sapiens, Similar to RIKEN cDNA 2010 9.05
446223 BE300091 Hs.118699 hypothetical protein FLJ12969 9.04
410001 AB041036 Hs.37771 kallikrein 11 9.03
424012 AW368377 Hs.137569 tumor protein 63 kDa with strong homolog 9.03
441791 AW372449 Hs.175982 hypothetical protein FLJ21159 9.02
448206 BBS22585 Hs.3731 ESTs, Moderately similar to I38022 hypot 9.02
414269 AA298489 olfactory receptor, family 51, subfamily 839
442081 AA401863 Hs.22380 ESTs 8.98
420092 AA814043 Hs.68045 ESTs 835
411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.80
421863 AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 830
454141 AW138413 Hs.182356 ATP-binding cassette, sub-family C (CFTR 830
418278 AI088489 Hs.83937 hypothetical protein 8.78
428330 L22524 Hs.2256 matrix metalloproteinase 7 (matrilysin, 8.76
432415 T16971 Hs.269014 ESTs, Weakly similar to A43932 mucin 2 p 8.75
424906 AJ566088 Hs.153716 Homo sapiens mRNA for Hmob33 protein, 3' 8.74
415245 N59650 Hs.27252 ESTs 8.72
442409 BE208843 Hs.129544 hypothetical protein MGC15438 8.70
404571 8.66
418033 W68180 Hs.259855 elongation factor-2 kinase 834
456497 AW967956 Hs.123648 ESTs, Weakly similar to AF108460 1 ubinu 836
405876 834
448807 AI571940 Hs.7549 ESTs 832
445372 N36417 Hs.144928 ESTs 8.48
425171 AW732240 Hs.300615 ESTs BM
419968 X04430 Hs.93913 interleukin 6 (interferon, beta 2) 838
407385 AA610150 Hs.272072 ESTs, Weakly similar to I38022 hypotheti 831
433172 AB037841 Hs.102652 hypothetical protein ASH1 830
422631 BE218919 Hs.118793 hypothetical protein FLJ10688 827
412719 AW016610 Hs.129911 ESTs 824
418849 AW474547 Hs.33565 Homo sapiens PIG-M mRNA for mannosyltran 822
444922 AI921750 Hs.144871 Homo sapiens cDNA FLJ13752 fis, clone PL 822
427674 NM_003528 Hs.2178 H2B histone family, member Q 820
432101 AI918950 Hs.11092 EphA3 8.17
416288 H51299 gb:yp07c06.s1 Soares breast 3NbHBst Homo 8.15
404915 8.08
440106 AA864968 Hs.127699 KIAA1603 protein 8.07
442861 AA243837 Hs.37787 ESTs 8.06
452259 AA317439 Hs.28707 signal sequence receptor, gamma (translo 8.06
443250 AI041530 Hs.132107 ESTs 8.06
437267 AW511443 Hs.258110 ESTs 8.04
452891 N75582 Hs.212875 ESTs, Weakly similar to DYH9_HUMAN COJ 8.02
422219 AW978073 regulator of mitotic spindle assembly 1 830
453049 BB37217 Hs.30343 ESTs 8.00
439731 AI953135 Hs.45140 hypothetical protein FLJ14084 738
408554 AA836381 Hs.7323 nuclear receptor co-repressor/HDAC3 comp 734
421154 AA284333 Hs.287631 Homo sapiens cDNA FLJ14269 fis, clone PL 734
430107 AA465283 Hs.105069 ESTs 734
433404 T32982 Hs.102720 ESTs 7.93
450813 AI739625 Hs.203376 ESTs 730
416239 AL038450 Hs.48948 ESTs 735
448212 AI475858 gb:tc87d07.x1 NCI_CGAP_CLL1 Homo sapiens 732
449532 W74653 Hs.271593 ESTs, Moderately similar to A47582 B-cel 732
413930 M86153 Hs.75618 RAB11A, member RAS oncogene family 730
458191 AI420611 Hs.127832 ESTs 730
444858 AM 99738 Hs.208275 ESTs, Weakly similar to ALUA_HUMAN !!! 7.78
457488 AI732230 Hs.191737 ESTs 7.78
407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 7.76
433759 AA660003 Hs.109363 Homo sapiens cDNA: FLJ23603 fis, clone L 7.74
433805 AA706910 Hs.112742 ESTs 7.74
426485 NM_006207 Hs.170040 platelet-derived growth factor receptor- 7.72
446028 R44714 Hs.106795 Homo sapiens cDNA FLJ13136 fis, clone NT 7.72
418555 AI417215 Hs.67159 hypothetical protein FLJ12577 7.70
447489 AW262580 Hs.147674 protocadherin beta 16 7.70
419839 U24577 Hs.53304 phospholipase A2, group VII (platelet-ac 7.68
416857 AA188775 Hs.292453 ESTs 758
413801 M 62246 Hs.55406 ESTs, Highly similar to unnamed protein 756
425480 AB023198 Hs.158135 KIAA0981 protein 7.66
420120 AL049610 Hs.55243 transcription elongation factor A (SII)- 7.64
424099 AF071202 Hs.139336 ATP-binding cassette, sub-family C (CFTR 7.64
446307 T50083 Hs.5094 ESTs 7£3
429220 AW207206 Hs.136319 ESTs 759
420345 AW295230 Hs.25231 ESTs 754
429208 AA447990 Hs.190478 ESTs 754
447247 AW369351 Hs.287955 Homo sapiens cDNA FLJ13090 fis, clone NT 753
440995 T57773 Hs.10263 ESTs 753
448706 AW291095 Hs.21814 interleukin 20 receptor, alpha 752
410227 AB009284 Hs.51152 exostoses (multiple)-like 2 7.49
431616 AA508552 Hs.195839 ESTs, Weakly similar to I38022 hypotheti 7.46
434217 AW014795 Hs.23349 ESTs 7.44
431467 K71831 Hs.256398 Homo sapiens mRNA; cDNA DKFZp434E0528 (f 7.42
448519 AW175665 Hs.244334 Homo sapiens prostein mRNA, complete cds 7.42
446791 AI632278 Hs.34981 ESTs 7.40
419743 AW408762 Hs.127478 Homo sapiens clone 24416 mRNA sequence 759
445855 BE247129 Hs.145569 ESTs 736
425211 M18667 Hs.1807 progastricsin (pepsinogen C) 755
419131 AA406293 Hs.301622 ESTs 754
400294 N95796 Hs.179809 Homo sapiens prostein mRNA, complete cds 753
441736 AW292779 Hs.169799 ESTs 728
427701 AA411101 Hs.221750 nuclear autoantigenic sperm protein (his 7.24
457733 AW974812 Hs.291971 ESTs 724
418432 M14156 Hs.85112 insulin-like growth factor 1 (somatomedi 722
441201 AW118822 Hs.128757 ESTs 721
419953 BE2G7154 Hs.125752 ESTs 720
419991 AJ000098 Hs.94210 eyes absent (Drosophila) homolog 1 720
425018 BE245277 Hs.154196 E4F transcription factor 1 720
424560 AA158727 Hs.150555 protein predicted by clone 23733 7.18
435380 AA879001 Hs.192221 ESTs 7.14
420658 AW965215 Hs.130707 ESTs 7.12
408291 AB023191 Hs.44131 KIAA0974 protein 7.10
409110 AA191493 Hs.48778 niban protein 7.10
414485 W27026 Hs.182625 VAMP (vesicle-associated membrane protei 7.10
430039 BE253012 Hs.153400 ESTs, Weakly similar to ALU1_HUMAN ALU S 7.10
450832 AW970602 Hs.105421 ESTs 7.10
417153 X57010 Hs.81343 collagen, type II, alpha 1 (primary oste 7.08
412446 AI768015 Hs.52127 ESTs 7.07
412953 Z45794 Hs.238809 ESTs 7.06
418051 AW192535 Hs.19479 ESTs 7.06
421566 NM_000399 Hs.1395 early growth response 2 (Krox-20 (Drosop 7.04
446999 AA151520 Hs.279525 hypothetical protein MGC4485 7.04
440529 AW207640 Hs.16478 Homo sapiens cDNA: FLJ21718 fis, clone C 7.04
441111 AI806867 Hs.126594 ESTs 7.01
451027 AW519204 Hs.40808 ESTs 7.00
408432 AW195262 gb:xn67b05.x1 NCI_CGAP_CML1 Homo sapiens 7.00
432223 AA333283 Hs.285336 Homo sapiens, clone IMAGE3460280, mRNA 7.00
444805 AB007899 Hs.12017 homolog of yeast ubiquitin-protein ligas 6.99
414212 AA136569 Hs.295940 KIAA0187 gene product 6.98
431725 X65724 Hs.2839 Norrie disease (pseudoglioma) 6.98
449685 AW298669 Hs.66095 ESTs 6.97
447313 U92981 Hs.18081 Homo sapiens clone OT1P1B6 mRNA, CAG rep 656
424590 AW966399 Hs.46821 hypothetical protein FLJ20086 654
449655 AI021987 Hs.59970 ESTs 652
419563 AA526235 Hs.193162 Homo sapiens cDNA FLJ11983 fis, clone HE 650
434163 AW974720 Hs.25206 group XII secreted phospholipase A2 659
415809 Z32789 Hs.46601 ESTs 656
425782 U66468 Hs.159525 cell growth regulatory with EF-hand doma 655
4179K AA767382 Hs.193417 ESTs 654
427408 AA583206 Hs.2156 RAR-related orphan receptor A 6.79
445873 AA250970 Hs.251946 poly(A)-binding protein, cytoplasmic 1-1 6.74
410718 AI920783 Hs.191435 ESTs 6.74
432363 AA534489 gbxif76g1 1*1 NCI_CGAP_Co3 Homo sapiens 6.74
438521 AW203986 Hs.213003 ESTs 6.73
435604 AA625279 Hs.26892 uncharacterized bone marrow protein BM04 6.73
419083 AI479560 Hs.98613 Homo sapiens cDNA FLJ12292 fis, clone MA 6.72
418245 AA088767 Hs.83883 transmembrane, prostate androgen induced 6.70
420714 BE172704 Hs.222746 KIAA1610 protein 6.70
412707 AW206373 Hs.16443 Homo sapiens cDNA: FLB1721 fis, clone C 657
421896 N52293 Hs.45107 ESTs 656
411078 AI222020 Hs.182364 CocoaCrisp 656
452465 AA610211 Hs.34244 ESTs 656
422763 AA033699 Hs.83938 ESTs, Moderately similar to MAS£_HUMAN M 656
444618 AV653785 Hs.300171 ELL-RELATED RNA POLYMERASE II, ELONGATIO 654
450164 AI239923 Hs.30098 ESTs 653
431060 AF039307 Hs.249171 homeobox A11 652
408031 AA081395 Hs.42173 Homo sapiens cDNA FLJ10368 fis, clone NT 652
420285 AA258124 Hs.293878 ESTs, Moderately similar to ZN91_HUMAN Z 6.62
444670 H58373 Hs.57494 hypothetical protein MGC5370 652
444489 AI151010 Hs.157774 ESTs 6.60
445685 AW779829 Hs.263436 gb^n88a05o1 NCI_CGAP_Kid11 Homo sapien 6.60
435677 AA694142 Hs.293726 ESTs, Weakly similar to TSGA_RAT TESTIS 659
452221 C21322 Hs.11577 hypothetical protein FLJ22242 659
431510 AA580082 Hs.112264 ESTs 656
415874 AF091622 Hs.78893 KIAA0244 protein 654
418405 AI868282 Hs.11898 ESTs, Highly similar to KIAA1370 protein 654
452768 AW069459 Hs.61539 ESTs 654
401451 652
416289 W26333 ESTs 652
431778 AL080276 Hs.268562 regulator of G-protein signalling 17 651
409089 NM_014781 Hs.50421 KIAA0203 gene product 650
442633 AA328153 Hs.88201 ESTs, Weakly similar to A Chain A, Cryst 650
431992 NM_002742 Hs.2891 protein kinase C, mu 6.49
418833 AW974899 Hs.292776 ESTs 6.48
429163 AA884766 gb:am20a10.s1 Soares_NFL_T_GBC_S1 Homo s 6.46
430403 AF039390 Hs.241382 tumor necrosis factor (ligand) superfami 6.48
443058 AW451642 Hs.16732 ESTs 6.46
418564 AA631143 Hs.179809 Homo sapiens prostein mRNA, complete cds 644
432674 AA841092 Hs.257339 ESTs, Weakly similar to I38022 hypotheti 6.44
423600 AI633559 Hs.29078 ESTs 6.44
404253 6.42
433610 AA806822 Hs.112547 ESTs 6.42
421552 AF026692 Hs.105700 secreted frizzled-related protein 4 641
407118 AA156790 Hs.262036 ESTs, Weakly similar to Z223_HUMAN ZINC 640
408608 N79738 Hs.136102 KIAA0853 protein 6.40
421452 AI925946 Hs.104530 fetal hypothetical protein 640
433285 AW975944 Hs.237396 ESTs 6.40
434926 BE543269 Hs.50252 mitochondrial ribosomal protein L32 640
446189 H85224 Hs.214013 ESTs 640
416806 NM_000288 Hs.79993 peroxisomal biogenesis factor 7 658
416467 H57585 Hs.57467 ESTs 656
453403 BE466639 Hs.61779 Homo sapiens cDNA FLJ13591 fis, clone PL 654
429769 NM_004917 Hs.216366 kallikrein 4 (prostase, enamel matrix, p 654
423642 AW452650 Hs.157148 hypothetical protein MGC13204 652
425843 BE313280 Hs.159627 death associated protein 3 652
439221 AA737106 Hs.32250 ESTs, Moderately similar to I78885 serin 652
428194 AA765603 Hs.180877 H3 histone, family 3B (H3.3B) 650
431958 X63629 Hs.2877 cadherin 3, type 1, P-cadherin (placenta 650
439366 AF100143 Hs.6540 fibroblast growth factor 13 650
452789 AW081626 Hs.242561 ESTs 650
416838 D54745 Hs.80247 cholecystokinin 650
436962 AW377314 Hs.5364 DKFZP564I052 protein 629
433383 AP034837 Hs.192731 double-stranded RNA specific adenosine d 629
418638 AW749855 gb:QV4-BT0534-281299-053-c05 BT0534 Homo 626
450728 AW162923 Hs.25363 presenilin 2 (Alzheimer disease 4) 625
44038 AI004193 Hs.22123 ESTs 624
453745 AA552989 Hs.63908 hypothetical protein MGC14726 624
426595 AW971980 Hs.62402 p21/Cdc42/Rac1-activated kinase 1 (yeast 624
444412 AI147652 Hs.216381 Homo sapiens clone HH409 unknown mRNA 624
413384 NM_000401 Hs.75334 exostoses (multiple) 2 622
426320 W47595 Hs.169300 transforming growth factor, beta 2 o\Z2
423349 AF010258 Hs.127428 homeobox A9 D-20
429165 AW009888 Hs.118258 prostate cancer associated protein 1 6.16
424800 AL035588 Hs.153203 MyoD family inhibitor 6.16
409564 AA045857 Hs.54943 fracture callus 1 (rat) homolog 6.16
438796 W67821 Hs.109590 genethonin 1 6.16
425451 AF242769 Hs.157461 mesenchymal stem cell protein DSC54 6.14
451663 AI872360 Hs.209293 ESTs 6.14
413623 AA825721 Hs.246973 ESTs 6.12
452232 AW020603 Hs.271698 radial spoke protein 3 6.12
453390 AA862496 Hs.28482 ESTs 6.12
435542 AA687376 Hs.269533 ESTs 6.12
420424 AB033038 Hs.97594 KIAA1210 protein 6.11
407103 AA424881 Hs.256301 hypothetical protein MGC13170 6.10
409734 BE161684 Hs£6155 hypothetical protein 6.10
432686 BE223007 Hs.152460 Homo sapiens cDNA FLJ12909 fis, clone NT 6.10
438361 AA805666 Hs.146217 Homo sapiens cDNA: FLJ23077 fis, clone L 6.10
411479 AW848047 gML3-CT0214-291299-052-A12 CT0214 Homo 6.10
438849 W28948 Hs.10762 ESTs 6.08
452726 AF188527 Hs.61661 ESTs, Weakly similar to AF174605 1 F-box 6.08
445895 D29954 Hs.13421 KIAA0056 protein 6.08
440774 AI420611 Hs.127832 ESTs 6.07
422583 AA410506 Hs.118578 KIAA0874 protein 6.06
427500 AW970017 Hs.293948 ESTs, Weakly similar to S65657 alpha-l O 6.04
443646 AI085198 Hs^ 88699 ESTs 6.04
410566 AA373210 Hs.43047 Homo sapiens cDNA FLJ13585 fis, clone PL 6.02
417845 AL117461 Hs.32719 Homo sapiens mRNA; cDNA DKFZp586F1822 (f 6.02
430273 AI311127 Hs.125522 ESTs 6.02
434782 AA649253 Hs.132458 ESTs 6.01
442490 AW965078 Hs.30212 thyroid receptor interacting protein 15 6.01
420026 AI831190 Hs.166676 ESTs 6.00
437782 AI370876 Hs.123163 exportin 1 (CRM1, yeast, homolog) 6.00
447359 NM_012093 Hs.18268 adenylate kinase 5 6.00
447713 AW20733 Hs£O7083 ESTs 6 DO
451073 AI758905 Hs.206063 ESTs 6iX)
451640 AA195601 Hs.26771 Human DNA sequence from clone 747H23 on 6 DO
410889 X91662 Hs.66744 twist (Drosophila) homolog (acrocephalos 5J7
441222 AI277237 Hs.44203 hypothetical protein FLJ23153 5J6
447732 AI758398 Hs.161318 ESTs 5^6
437756 AA767537 Hs.197096 ESTs 5S5
408K9 NM_006042 Hs.48384 heparan sulfate (glucosamine) 3-O-sulfot 5.94
453911 AW503857 Hs/007 sarcolemmal-associated protein 5^4
414085 AA114016 Hs.75746 aldehyde dehydrogenase 1 family, member 5.93
408875 NM_015434 Hs.48604 DKFZP434B168 protein 552
439451 AP086270 Hs.278554 heterochromatin-like protein 1 5.92
423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 S^1
453060 AW294092 Hs.21594 hypothetical protein MGC15754 5.91
420407 AA814732 Hs.145010 lipopolysaccharide-speci 5^1
450480 X82125 Hs.25040 zinc finger protein 239 5.90
408446 AW450669 Hs.45068 hypothetical protein DKFZp434l143 &B8
421039 NM_003478 Hs.101289 cullin 5 5.88
451684 AF216751 Hs.26813 COA14 5.88
436063 AK000028 Hs.250867 ribosomal protein S24 5^8
410507 AA355288 Hs.271408 transitional epithelia response protein 5^6
420179 N74530 Hs.21168 ESTs 5^4
453878 A Wa 54440 Hs.19025 UU3Z
452270 AW975014 Hs.26 ferrochelatase (protoporphyria) 533
435867 AA554229 Hs.114052 ESTs 532
417683 AW566008 Hs.239154 ankyrin repeat, family A (RFXANK-like), 532
432005 AA524190 Hs.120777 ESTs, Weakly similar to ELU_HUMAN RNA P 531
406815 AA833930 Hs.288036 tRNA isopentenylpyrophosphate transferas 530
437980 R50393 Hs.278438 KIAA1474 protein 530
425856 AA364908 Hs.98927 hypothetical protein FLJ13993 5.79
400301 X03635 Hs.1657 estrogen receptor 1 5.78
446261 AA313893 Hs.13399 hypothetical protein FLJ12615 similar to 5.78
410141 R07775 Hs.287657 Homo sapiens cDNA FLJ21291 fis, clone C 5.77
427258 AA400091 Hs.39421 ESTs 5.76
419108 AA389724 Hs.191264 ESTs, Weakly similar to ALU?_HUMAN ALU S 5.76
442029 Hs.14456 neural precursor cell expressed, develop 5.76
407783 AW996872 Hs.172028 a disintegrin and metalloproteinase doma 5.75
434408 AI031771 Hs.132588 ESTs 5.74
415077 L41607 Hs.934 glucosamine (N-acetyl) transferase 2, 1 5.74
432435 BE218886 Hs.262070 ESTs 5.74
433313 W20128 Hs.296039 ESTs 5.73
431740 N75450 Hs.183412 ESTs, Moderately similar to AF116721 67 5.73
412991 AW949013 gb:QV4-FT0005-110500-201-e12 FT0005 Homo 5.72
418852 BE537037 Hs.273294 hypothetical protein FLJ20069 5.72
418862 NM_004996 Hs.89433 ATP-binding cassette, sub-family C (CFTR 5.72
446887 AB007891 Hs.16349 KIAA0431 protein 5.72
437886 AA156781 Hs.83992 metallothionein 1E (functional) 5.72
410232 AW372451 Hs.61184 CGI-79 protein 5.70
414452 AA454038 Hs.29032 ESTs 5.70
422782 AL031320 Hs.119976 Human DNA sequence from clone RP1-20hE o 5.70
428730 AA625947 Hs.25750 ESTs 5.70
431571 AW500486 Hs.180810 splicing factor proline/glutamine rich ( 5.70
433393 AF038564 Hs.58074 itchy (mouse homolog) E3 ubiquitin prote 5.70
450816 AL133067 H&252U hypothetical protein 5.70
443774 AL117428 Hs.9740 DKFZP434A236 protein 5.69
446100 AW967109 Hs.13804 hypothetical protein dJ4620232 5.69
419168 AI336132 Hs.53718 Homo sapiens cDNA FLJ12641 fis, clone NT 5.68
416653 AA768553 Hs.77498 metallothionein 1E (functional) 5.67
452679 Z42387 Hs.4299 transmembrane, prostate androgen induced 5.66
450244 AA007534 Hs.125062 ESTs 5.66
408621 AI970672 Hs.46638 chromosome 11 open reading frame 8 5.65
450325 AI935962 Hs.26289 ESTs 5.65
439671 AW162840 Hs.6641 kinesin family member 5C 5.64
452387 AI680772 Hs.4316 trinucleotide repeat containing 12 5.64
413992 W26276 Hs.136075 RNA, U2 small nuclear 5.63
444151 AW972917 Hs.128749 alpha-methylacyl-CoA racemase 5.63
417791 AW965339 Hs.111471 ESTs 5.62
410186 AI936442 Hs.59838 hypothetical protein FLJ10808 5.60
415123 D60925 ESTs 5.60
429170 NM_001394 Hs.2359 dual specificity phosphatase 4 5.60
434415 BE177494 gbflC6W0596-270300O1 1-O05 HT0596 Homo 5.60
440738 AI004650 Hs.225574 WD repeat domain 9 5.60
443830 AI142095 Hs.143273 ESTs 5.60
446803 AI655662 Hs.197698 ESTs 5.60
414342 AA742181 Hs.75912 KIAA0257 protein 5.59
422634 NM_016010 Hs.118821 CGI-62 protein 556
435047 AA454985 Hs.54973 cadherin-like protein VR20 5.55
400268 555
452055 AI377431 Hs.293772 hypothetical protein MGC10858 534
437073 AI885608 Hs.94122 ESTs 554
434072 H70854 Hs.283059 Homo sapiens PRO1082 mRNA, complete cds 553
418339 AA639902 Hs.104215 ESTs, Moderately similar to SPCN_HUMAN S 552
434551 BE387162 Hs.280858 ESTs, Highly similar to A35661 DNA excis 552
439569 AW602166 Hs.222399 CEGP1 protein 551
441102 AA973905 Hs.16003 intermediate filament protein syncoilin 550
448310 AI480316 gbim26h09jd Soares_NFL_T_GBC_S1 Homo s 550
413173 BB076928 Hs.70980 ESTs 5.48
436246 AW450963 Hs.119991 ESTs 5.48
449300 AI656959 Hs.222165 ESTs 5.46
452823 AB012124 Hs.30696 transcription factor-like 5 (basic helix 5.48
451403 AA885569 Hs.15727 Homo sapiens cDNA FLJ14511 fis, clone NT 5.46
417061 AI675944 Hs.188691 Homo sapiens cDNA RJ izuoo trs, done He 5.44
429126 AW172356 Hs.99083 ESTs 5.44
431316 AA502663 Hs.145037 ESTs 5.44
439162 AW970536 Hs.105413 ESTs 5.44
431938 AA938471 Hs.115242 specific granule protein (28 kDa); cyste 544
451552 AA047233 Hs.53810 ESTs 5.43
416991 N36389 Hs£95091 KIAA0226 gene product 542
427638 AA406411 Hs.208341 ESTs, Weakly similar to KIAA0989 protein 542
427718 AI798680 Hs.25933 ESTs 542
438710 AA833907 Hs.178724 ESTs, Weakly similar to ALU1_HUMAN ALU S 542
406076 AL390179 Hs.137011 Homo sapiens mRNA; cDNA DKFZp547P134 (fr 540
431263 AW129203 Hs.13743 ESTs 540
421264 AL039123 Hs.103042 microtubule-associated protein 1B 558
421685 AF189723 Hs.106778 ATPase, Ca++ transporting, type 2C, memb 557
408460 AA054726 Hs.285574 ESTs 5.38
409091 AW970385 Hs.269423 ESTs 5.36
421987 AI133161 Hs.286131 CGI-101 protein 5.38
426002 AA418703 0b2v88cQ3*1 Soares_NhHMPu_S1 Homo sapi 5.38
441217 AI922183 Hs.213246 ESTs 5.38
426006 R45031 Hs.22627 ESTs 5.35
422806 BE314707 Hs.1581 glutathione S-transferase theta 2 5.34
432281 AK001239 Hs.274263 hypothetical protein FLJ10377 522
451982 F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp56401763 (f 522
421129 BE439899 Hs.89271 ESTs 5.31
444042 NM_004915 Hs.10237 ATP-binding cassette, sub-family G (WHIT 521
410150 AW382942 Hs.6774 ESTs 5.30
423952 AW877787 Hs.136102 KIAA0853 protein 5.30
452822 X85689 Hs.288617 hypothetical protein FLJ22621 5.30
447752 M73700 Hs.347 lactotransferrin 529
441766 R53790 Hs.23294 hypothetical protein FLJ14393 529
431359 AW993522 Hs.292934 ESTs 527
427212 AW293849 Hs.58279 ESTs, Weakly similar to ALU7_HUMAN ALU S 527
449916 T60525 Hs.299221 pyruvate dehydrogenase kinase, isoenzyme 527
454014 AW016670 Hs.233275 ESTs 527
419714 AA758751 Hs£8216 ESTs 525
Amis AL157579 Hs.153810 KIAA0751 gene product 526
417333 AL157545 Hs.42179 bromodomain and PHD finger containing, 3 524
419986 AI345455 Hs.78915 GA-binding protein transcription factor, 524
407182 AA312551 Hs.230157 ESTs 522
420111 AA255652 gb2s21h11j1 NCI_CGAP_GCB1 Homo sapiens 522
426058 AI821625 Hs.191602 ESTs 522
459551 Af472806 gb^70e07jc1 $oamJ1SFJBJWJJTJ>AJ>_S 522
432524 AM58020 Hs.293287 ESTs 522
436207 AA334774 Hs.12845 hypothetical protein MGC13159 522
410870 U81599 Hs.66731 homeobox B13 522
451418 BE387790 Hs.26369 hypothetical protein FLJ20287 522
409757 NM_001898 Hs.123114 cystatin SN 52\
441124 T07717 Hs.119563 ESTs 52\
428593 AW207440 Hs.185973 degenerative spermatocyte (homolog Droso 52\
436401 AI087958 Hs.29088 ESTs 520
437113 AA744693 gb:ny26c10.s1 NCI_CGAP_GCB1 Homo sapiens 520
450947 AI745400 Hs.204662 ESTs 520
453279 AW893940 Hs.59698 ESTs 520
445467 AI239832 Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S 5.19
448944 AB014605 Hs.22599 atrophin-1 interacting protein 1; activ 5.19
412198 AA937111 Hs.69165 ESTs 5.18
422646 K87863 Hs.151380 ESTs, Weakly similar to T16584 hypotheti 5.18
438986 AP085888 Hs.269307 ESTs 5.18
453954 AW118336 Hs.75251 DEAD/H (Asp-Glu-Ala-Asp/His) box binding 5.18
447541 AK000288 Hs.18800 hypothetical protein FLJ20281 5.18
434029 AA621763 Hs.170434 Homo sapiens cDNA FLJ14242 fis, clone OV 5.16
459294 AW977286 Hs.169531 RBP1-like protein 5.16
429441 AJ224172 Hs.204096 lipophilin B (uteroglobin family member) 5.16
424692 AA429834 Hs.151791 KIAA0092 gene product 5.15
427359 AW020782 Hs.79881 Homo sapiens cDNA: FLJ23006 fis, clone L 5.15
419872 AI422951 Hs.146162 ESTs 5.15
429422 AK001494 Hs.202596 Homo sapiens cDNA FLJ10632 fis, clone NT 5.14
448902 Z45998 Hs.22543 Homo sapiens mRNA; cDNA DKFZp761 11 912 (f 5.14
459065 N23235 Hs.30567 ESTs, Weakly similar to B34087 hypotheti 5.14
431318 AA502700 Hs.293147 ESTs, Moderately similar to A46010 X-lin 5.14
452953 AI932884 Hs.271741 ESTs, Weakly similar to A46010 X-linked 5.13
426372 AK000684 Hs.183887 hypothetical protein FLJ22104 5.12
434401 AI864131 Hs.71119 Putative prostate cancer tumor suppresso 5.12
416434 AW163045 Hs.79334 nuclear factor, interleukin 3 regulated 5.11
410288 AA316181 Hs.61635 six transmembrane epithelial antigen of 5.10
417517 AF001176 Hs.82238 POP4 (processing of precursor, S. cerev 5.10
453616 NM_003462 Hs.33846 dynein, axonemal, light intermediate pol 5.10
427958 AA418000 Hs.98280 potassium intermediate/small conductance 5.09
407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha polypep 5.08
425154 NM_001851 Hs.154850 collagen, type IX, alpha 1 5.08
412863 AA121673 Hs.39757 zinc finger protein 281 5.06
420807 AA280627 Hs.57846 ESTs 5.06
430568 AA769221 Hs.270847 delta-tubulin 5.06
433887 AA743991 gbJiy57gOU1 NCI_CGAP_Pr18 Homo sapiens 5.06
438375 AW015940 Hs.232234 ESTs 5.06
418092 R45154 Hs.106604 ESTs 5.06
418576 AW968159 Hs.289104 Alu-binding protein with zinc finger dom 5.05
413328 Y15723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 5.04
414271 AK000275 Hs.75871 protein kinase C binding protein 1 5.04
432729 AK000292 Hs.278732 hypothetical protein FLJ20285 5.04
433433 AI692623 Hs.121513 Homo sapiens clone Z3-1 placenta expres 5.04
439662 H97552 Hs.269060 ESTs 5.04
439743 AL389956 Hs.283858 Homo sapiens mRNA full length insert cDN 5.04
417511 AL049176 Hs.62223 chordM® 5.02
437814 AI088192 Hs.135474 ESTs, Weakly similar to DDX9_HUMAN ATP-D 5.02
426342 AF093419 Hs.169378 multiple PDZ domain protein 5.02
429782 NM_005754 Hs.220689 Ras-GTPase-activating protein SH3-domain 5.02
429975 AI167145 Hs.165538 ESTs 5.02
436209 AW850417 Hs.254020 ESTs, Moderately similar to unnamed prot 5.02
438571 AW020775 Hs.56022 ESTs 5.02
450223 AA418204 Hs.241493 natural killer-tumor recognition sequenc 5.02
408267 AW380525 Hs.267705 tubulin-specific chaperone e 5.01
417730 Z44761 gb:HSC28R)61 normalized infant brain cDN 5.00
425465 L18964 Hs.1904 protein kinase C, iota 5.00
430599 NM_004855 Hs.247118 phosphatidylinositol glycan, class B 5.00
450961 AW978813 Hs.250867 metallothionein 1E (functional) 5.00
451386 AB029006 Hs.26334 spastic paraplegia 4 (autosomal dominant 5.00
420380 AA640891 Hs.102406 ESTs 4.99
424947 R77952 Hs.239625 ESTs, Weakly similar to alternatively sp 4.99
442653 BE269247 Hs.170228 gb:601185486F1 NIH_MGC_8 Homo sapiens cD 4.98
457211 AW972565 Hs.52399 ESTs, Weakly similar to S51797 vasodilat 4.97
425851 NM_001490 Hs.159642 glucosamine (N-acetyl) transferase 1, c 4.97
446279 AA490770 Hs.182382 ESTs 4.96
433377 AI752713 H<U3845 ESTs 4.96
450218 R02018 Hs.168640 ankylosis, progressive (mouse) homolog 4.96
412715 NM_000947 Hs.74519 primase, polypeptide 2A (58kD) 4.94
448164 R61680 Hs.26904 ESTs, Moderately similar to Z195_HUMAN Z 4.94
420121 AW968271 Hs.191534 ESTs, Weakly similar to ALU1_HUMAN ALU S 454
421689 N87820 Hs.106826 KIAA1696 protein 4.93
445808 AV655234 Hs.298083 ESTs, Moderately similar to PC4259 feni 4.92
416533 BE244G53 Hs.79362 retinoblastoma-like 2 (p130) 4.92
418049 AA211467 Hs.190488 Homo sapiens, Similar to nuclear localiz 4.92
436039 AW023323 Hs.121070 ESTs 4.92
432653 N62096 Hs.293185 ESTs, Weakly similar to JC7328 amino ac 4.91
420324 AF163474 Hs.96744 prostate androgen-regulated transcript 1 4.91
403047 451
436899 AA764852 Hs.291567 ESTs 450
431117 AF003522 Hs.250500 delta (Drosophila)-like 1 450
427617 D42063 Hs.178825 RAN binding protein 2 4.68
428604 AK000713 Hs.193736 hypothetical protein FLJ20706 4.68
433050 AI093930 Hs.163440 Homo sapiens cDNA: FLJ21000 fis, clone C 4.88
418575 AA225313 Hs.222886 ESTs, Weakly similar to TRHY_HUMAN TRICH 4.86
432615 AA557191 Hs.55028 ESTs, Weakly similar to I54374 gene NF2 456
412652 AB01777 Hs.6774 ESTs 456
432473 AI202703 Hs.152414 ESTs 4.88
449071 NM_005872 Hs.22960 breast carcinoma amplified sequence 2 456
450654 AJ245587 Hs.25275 Kruppel-type zinc finger protein 455
418866 T65754 Hs.100489 gb:yc11c07.s1 Stratagene lung (937210) H 455
407596 R86913 gb:yq30f05.r1 Soares fetal liver spleen 454
456516 BE172704 Hs.222746 KIAA1610 protein 4.84
426501 AW043782 Hs.293616 ESTs 4.84
448730 AB032933 Hs^1894 KIAA1157 protein 454
458339 AW976853 Hs.172843 ESTs 453
422083 NM_001141 Hs.111256 arachidonate 15-lipoxygenase, second typ 452
420159 AI572490 Hs.59785 Homo sapiens cDNA: FLJ21245 fis, clone C 452
424103 NM_001918 Hs.139410 dihydrolipoamide branched chain transacy 452
449535 W15267 Hs.23672 low density lipoprotein receptor-related 452
422048 NM_012445 Hs*88126 spondin 2, extracellular matrix protein 452
416737 AF154335 Hs.79691 LIM domain protein 452
419972 AL041465 Hs594038 floig>67 451
420235 AA256756 Hs.51178 ESTs 451
423412 AF109300 Hs.147924 prostate cancer associated protein 5 450
429598 AA811257 Hs.269710 ESTs 4X0
457114 AI821625 Hs.191502 ESTs 4X0
421828 AW891965 Hs.289109 histone deacetylase 3 4.79
424602 AK002055 HsX01129 hypothetical protein FL/t 1193 4.78
428364 AA428565 Hs.160541 ESTs, Moderately similar to ALU1_HUMAN A 4.78
452335 AW188944 HsX1272 ESTs 4.78
410765 AI694972 HsX6180 nucleosome assembly protein 1-like 2 4.77
421040 AA715026 Hs.135280 ESTs 4.76
421518 AI056392 Hs.208819 ESTs 4.76
452560 BE077084 ESTs 4.76
409752 AW963990 gb:EST376063 MAGE resequences, MAGH Homo 4.75
439703 AF086538 Hs.186245 ESTs 4.75
418836 AI655499 Hs.161712 ESTs 4.74
450642 R39773 Hs.7130 copine IV 4.74
419879 Z17805 Hs.93564 Homer, neuronal immediate early gene, 2 4.74
411440 AW749402 gb:QV4-BT0383-281299-081-c08 BT0383 Homo 4.74
450649 NM_001429 Hs^97722 E1A binding protein p300 4.74
408738 NM_014785 Hs.47313 KIAA0258 gene product 4.73
435020 AW505076 HsX01855 DiGeorge syndrome critical region gene 8 4.72
411624 BE145964 KIAA0594 protein 4.72
439360 AA448488 HsX5346 ribosomal protein U4 4.72
440491 R35252 Hs.24944 ESTs, Weakly similar to 2109260A B cell 4.72
442611 BE077155 Hs.177537 hypothetical protein DKFZp761B1514 4.72
443555 N71710 Hs.21398 ESTs, Moderately similar to A Chain A, H 4.72
453800 BE300741 Hs.288416 hypothetical protein FLJ13340 4.72
457528 AW973791 Hs*82784 ESTs 4.72

416795 AI497778 Hs. 168053 HBV pX associated proteJn-8 4.71 

407302 R74206 Hs.268755 ESTs, Weakly simCar to 178885 serine/th 4.71 

_ 404721 4.70 

30 426261 AW242243 Hs.168670 peroxisomal famesytated protein 4.70 

431924 AK000850 Hs.272203 Homo sapiens cDNA FL&0843 fis, clone AD 4.70 

435256 AF193768 Hs.13872 cytotdne-iike protein C1 7 4.70 

438295 AI394151 HsX7932 ESTs 4.70 

442655 AWQ27457 HsX0323 ESTs, Weakly similar to B34087 hypotheti 4.70 

35 415788 AW628686 Hs.78851 K1AAQ21 7 protein 4X9 

442760 BE075297 Hs. 10067 ESTs, Weakly similar to A43932 mucin 2 p 4X9 

432432 AA541323 Hs.115831 ESTs ' 4X8 

454398 AA463437 Hs.11556 Homo sapiens cDNA FU12566 fis, done NT 4X8 

452741 BE382914 HsX0503 Homo sapiens cDNA FU1 1344 fis, done PL 4X7 

40 424853 BE549737 Hs.1 32967 Human EST clone 122887 mariner transposo 4X7 

419706 C04649 Hs.77899 tropomyosin 1 (alpha) 4X8 

412088 AI689496 Hs.108932 ESTs 4X5 

416276 U41060 Hs.79136 U\M protein, estrogen regulated 4X4 

429281 AA830856 Hs29808 Homo sapiens cDNA: FU21 122 fis, clone C 4X4 

45 448207 AI475490 Hs.170577 ESTs 4X4 

408374 AW02543O Hs.155591 forkhead box F1 4X4 

447162 BE328091 Hs.157396 ESTs, WeaWy similar to A46010 X-linked 4X4 

451900 ABQ23199 Hs^7207 WAA0982 protein 4X3 

421437 AW821252 Hs.104336 hypothetical protein 4X3 

50 418624 AI734080 Hs.104211 ESTs 4X3 

426172 AA371307 Hs.125058 ESTs 4X2 

439831 AW136488 H&25545 ESTs 4X1 

452994 AW962597 HsX1305 WAA1547 protein - 4X1 

457726 AI217477 Hs.194591 ESTs 4X0 

55 434629 AA789081 H&4Q29 gtioma-arnplified sequence-41 4X0 

403764 4X8 

410659 AI080175 HsX8826 ESTs 4X8 

432383 AK000144 H&274449 Homo sapiens cDNA FU20137 fis, done CO 4X8 

451246 AW189232 H&39140 cutaneous T-ceO lymphoma tumor anfigen 4X8 

60 433234 AB040928 Hs.65366 K1AA1 495 protein 4X7 

424983 AI742434 Hs.169911 ESTs 4X6 

437812 AI582291 Hs.16846 ESTs, WeaWy similar to 04HU01 debrisoqu 4X6 

438447 AI082883 Hs.1 57533 hypothetical protein RJ 13409; K1AA1711 4X5 

434715 BE005346 Hs.116410 ESTs 4X5 

65 447673 AJ823987 Hs.1 82285 ESTs 4X4 

408897 N50204 H&283709 SpoporysaccharUe specific response-7 p 4X4 

436645 AW023424 Hs.1 56520 ESTs 4X4 

421247 BE391727 Hs.102910 • general trartscripfion factor IIH, pdype 4X3 

450377 AB033091 H&24936 WAA1265 protein 4X3 
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416232 AA179233 Hs.42380 nasopharyngeal carcinoma susceptibility 454 

423740 Y07701 Hs.1 32243 ammopertfdase puromycin sensitive 424 

442023 AJ 187878 Hs.144549 ESTs 424 

- 426764 AA732524 Hs.1 51464 ESTs, Weakly similar to ALUCJflJMAN 423 

5 454058 AI273419 Hs.135146 hypofoetical protein FU13984 423 

456511 AA262330 Hs. 145668 ESTs 422 

448330 AL036449 Hs207163 ESTs 422 

424701 NMJH5923 Hs.151988 mitogan-activated protein kinase kinase 421 

432621 AI298501 Hs.12807 ESTs, Weakly similar to T46428 hypotheti 420 

10 445707 AI248720 Hs.114390 ESTs 420 

419910 AA662913 Hs.190173 ESTs, Weakly simfer to A46010 X-flnked 420 

424085 NMJ002914 Hs.139226 repBcation factor C (activator 1)2 (40 420 

440749 W22335 H&7392 hypothetical protein MGC31 99 420 

„ 442787 W93048 Hs227203 hypoftefcal protein MGC2747 420 

IS 443414 R54594 Hs25209 ESTs 420 

443556 AA256769 H&94949 methytmaionyKJoA epimerase 420 

444170 AW613879 Hs. 102403 ESTs 420 

446751 AA766988 Hs*5874 Human DNA sequence from done RP1 1 -16L21 420 

421041 N36914 H&14691 ESTs, Moderately similar to I3B022 hypot 4.19 

20 447476 BE293466 HS20880 ESTs, Weakly similar to I38022 hypotheti 4.19 

448543 AW897741 Hs21380 Homo sapiens mRNA; cONA DKFZp586P1 124 (f 4.18 

410294 AB014515 Hs288891 K1AA0615 gene product 4.18 

433607 AA602004 HS23260 ESTs 4.18 

^ 435552 A1668638 Hs.193480 ESTs, Moderately similar to ALU6_HUMAN A 4.18 

25 447124 AW976438 Hs. 17428 RBP1-&6 protein 4.18 

453308 AW959731 Hs .32538 ESTs 4.17 

439328 W07411 Hs.1 18212 ESTs, Moderately similar to ALU3_HUMAN A 4J6 

430473 AW130690 Hs299842 ESTs 4.16 

437257 AI283085 Hs290931 ESTs, Weakly similar to YFJ7.YEAST HYPOT 4.16 

30 438018 AK001160 Hs£999 hypothetical protein FU10298 4.16 

443857 AI089292 Hs287621 hypothetical protein FU14069 4.15 

446711 AF169892 Hs.12450 pmtocadherin 9 4.15 

419103 Z40229 Hs.96423 hypothetical protein RJ23033 4.14 

„ 405403 4.14 

35 407378 AA299264 ESTs, Moderately similar to (38022 hypot 4.14 

408986 AW298602 Hs.1 97657 ESTs 4.14 

418^ AA227609 H&94834 ESTs 4.14 

434400 AI478211 Hs.1 86896 Homo sapiens cDNA FU1 1417 fis, done HE 4.14 

438578 AA811244 Hs.164168 ESTs 4.14 

40 450459 AI697193 Hs299254 Homo sapiens cONA: FU23597 fis, done L 4.14 

429887 AW366286 Hs.1 45696 .spHdng factor (CCU) 4.13 

448148 NMJH6578 H&20509 HBV pX assodaled protefn-8 4.13 

450316 W84446 Hs.17850 rrypothefical protein MQC4643 4.12 

417531 NM.003157 Hs.1 087 serine/threonine kinase 2 4.12 

45 431592 R6S016 Hs293871 rrypothefical protein MGC1 0895s 4.12 

432463 AA548518 Hs. 186733 ESTs 4.12 

433613 AA836126 H&5669 ESTs 4.12 

434739 AA804487 Hs.144130 ESTs 4.12 

438259 AW205969 Hs.1 31 808 ESTs 4.12 

50 425810 A1923627 Hs3l903 ESTs 4.10 

432672 AW973775 Hs.1 30760 myosin phosphatase, target subunit 2 4.10 

433345 AI681545 H&152982 hypothetical protein HJ13117 4.10 

432712 AB016247 Hs288031 steroKiWesaturase (fungal ERG3, delta - 4.09 

453020 AL162039 H&31422 Homo sapiens n^cDNAOfff=Zp434M229(fr 4.09 

55 412045 AA099802 H&4299 transmembrane, prostate androgen Induced 4.09 

435114 AA775483 Hs288936 mitochondrial ribosomai protein U9 4j08 

443204 AW205878 Hs29643 Homo sapiens cONA RJ13103 Ms, clone NT 4.08 

445459 A1478629 Hs.1 58465 likely orthotog of mouse putative IKK re 4.08 

438938 H46212 Hs.137221 ESTs 4j07 

60 454119 BE549773 H&40510 uncoupling protein 4 4.06 

411000 N40449 H&201619 ESTs, Weakfy simitar to S38383 SEB4B pro 4.06 

418928 AA232658 H&B7070 UDP^tucc^:gr/ccprotein giucosyttransfe 4.06 

424432 AB037821 Hs.146858 protocadherin 10 4.06 

449673 AA0Q2064 Hs.18920 ESTs 4.06 

65 429299 AK20463 H&99197 hypothetical protein MGC1 3102 4j06 

42174 AL049325 Hs.1 12493 Homo sapiens mTOA; cONA DKFZp564O036 (fr 4j05 

455497 AA112573 Hs285691 Homo sapiens prostein mRNA, complete cds 4j05 

415138 C18356 Hs.78045 tissue tactorpathway Inhibitor 2 4A4 

402791 4j04 
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ESTs, Weakly similar to ALIWJIUMAN ALU S 


3,48 


435572 AW975339 Hs.239828 ESTs, Weakly similar to GAG2_HUMAN RETRO 3.47 
407192 AA509200 gbaf12e02^1 Soares_testis_NHT Homo sap 3.47 
413435 X51405 Hs.75360 carboxypeptidase E a46 
447210 AF0352G9 Hs.17752 phosphatidylserine-specific phospholipas a46 
447958 AW796524 Hs£8644 Homo sapiens microsomal signal peptidase 3.46 
425312 AA354940 Hs.145958 ESTs a46 
442007 AA301116 Hs.142838 nucleolar phosphoprotein Nopp34 3.46 
417455 AW007066 Hs.18949 ESTs, Weakly similar to CA2B_HUMAN COLLA ZM 
426931 NM_003416 Hs.2078 zinc finger protein 7 (KOX 4, clone HF.1 ZAS 
408739 W01556 Hs.238797 ESTs, Moderately similar to I38022 hypot 3.45 
436024 AI800041 Hs.190555 ESTs 3.45 
408418 AW963897 Hs.44743 KIAA1435 protein 3.45 
409151 AA306105 Hs.50785 SEC22, vesicle trafficking protein (S. c 3.44 
418626 AW299508 Hs.135230 ESTs 3.44 
420560 AW207748 Hs.59115 ESTs ZM 
420686 AI950339 Hs.40782 ESTs ZM 
428870 AA436831 H336049 ESTs ZM 
436754 AI061288 Hs.133437 ESTs ZM 
437860 AI669586 Hs.222194 ESTs ZM 
452300 AW628045 Hs.28896 Homo sapiens mRNA full length insert cDN ZM 
421887 AW161450 Hs.109201 CQW6 protein ZM 

TABLE 5A shows the accession numbers for those primekeys lacking a UnigeneID in Tables 
5, 6, and 7. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from 
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity 
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank 
accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 
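The three columns described above can be held in a small lookup structure. The following is a minimal sketch, not part of the original disclosure: the `ClusterEntry` type, the helper names, and the whitespace-separated row layout are assumptions made for illustration, and the single example row (Pkey 411440) is the pairing of probeset, cluster number, and accessions as it appears to read from the flattened table.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ClusterEntry:
    """One Table 5A row: probeset key, gene cluster number, member accessions."""
    pkey: int
    cat_number: str
    accessions: Tuple[str, ...]


def parse_row(line: str) -> ClusterEntry:
    # Assumed layout: Pkey, CAT number, then one or more Genbank accessions.
    pkey, cat_number, *accessions = line.split()
    return ClusterEntry(int(pkey), cat_number, tuple(accessions))


def index_by_accession(entries: Iterable[ClusterEntry]) -> Dict[str, List[int]]:
    # Map each accession to every probeset whose cluster contains it.
    index: Dict[str, List[int]] = {}
    for entry in entries:
        for accession in entry.accessions:
            index.setdefault(accession, []).append(entry.pkey)
    return index


# Illustrative row; the pairing is an assumption read off the flattened table.
ROWS = ["411440 124577.1 AW749402 AW749403 Z45743 R80376 AA093358"]
ENTRIES = [parse_row(row) for row in ROWS]
BY_ACCESSION = index_by_accession(ENTRIES)
```

Indexing by accession rather than by Pkey is one possible design choice; it lets a reader start from a Genbank record and find the probeset whose cluster includes it.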



Pkey 


CAT number 


407596 


1003489J 


408432 


1058567J 


409752 


115301J 


409770 


1154048 J 


411440 


124577.1 


411478 


1247077J 


411624 


1252166J 


412991 


134248J 


414269 


143133.1 


415123 


1523390J 


415715 


1548818.1 


416288 


1585983.1 


416289 


1588037J 


417730 


1695795J 


418636 


1774Q2J 


419346 


184129.1 


419536 


185688.1 


420111 


190755J 


422219 


213547J 


424179 


236389.1 


424242 


237181J 


428002 


2856Q2J 


429163 


300543J 


432189 


342819J 


432340 


345248.1 


432363 


345469J 


432986 
433586 


356839J 
370470J 


433641 


37186J 



433687 
433891 
434415 
434565 
434804 
437113 
444168 
448212 
448310 
451746 



373061J 
376239.1 
385931.1 



I1J 
433234.1 
593829.1 
755099J 
757918.1 



R86913 R86901 H25352 R01370 H43764 AW044451 W21298 
AW1 95262 R27868 AW811262 

AW963990 AA078196 AW749482 AA077468 BE151 571 AA376917 
AW499536 AW499553 AW502138 AW499537 AW502136 AW501743 
AW749402 AW749403 Z45743 R80376 AA093358 

AW848047 AW848202 AW848631 AW848142 AW8487Q2 AW848121 AW848632 AW848140 AW848571 

AW848009 AW848067 AW848069 AW848905 AW848214 

BE145964 BE146286 AW854564 

AW949013AA126111 

AA298469AA137165 

D60925 D60828D80767 

F30354 F36559 T15435 

H51299 H44619 H46391 R86024 H51892 T72744 

W26333 R05358 H44682 

244761 R25801R11926R35604 

AW749855 AA225995 AW750208 AW750208 

AI830417AA236612 

AA603305 AA244095 AA244163 

AA255652 AA28091 1 AW967920 AA262684 

AW978073 AW978072 AA807550 AA306567 

F30712 F35665 AW263888 AI904014 AI904018 AA336927 AA336502 

AA337476 AW966227 AA450376 AW96Q222 AA381051 

AA418703AA418711 BB071915 BE071920BE071912 

AA884766 AW974271 AA592975 AA447312 

AA527941 AI810608 AI620190 AA635266 

AA534222AA632632 T81234 

AA534489 AW970240 AW970323 

AA650114 AW974148 AA572946 

T85301 AW517087 AA601054 BE073959 

AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 
AI680833 AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW2068Q2 Al 970376 
AI583718 A1672574 N25695 AW665466 AI818326 AA126128 AM80345 AW013827 AA248638 AI214B68 
AA204735 AA207155 AA206262 AA204833 AW003247 AW496808 AI080480 A1631703 AI651 023 AI86741 8 
AW818140AA502500AI206199 AI671282AI352545BE501030AI652535BE465762AA206331 AW451866 
AA471088 AA206342 AA204834 AA206100 AW021661 AA332922 N66048 AA703396 H92278 AW1 39734 
H92683 U87589 U87595 K69001 U87594 BE46642Q AI624817 BE46661 1 AI206344 AA574397 AA348354 
AI493192 

AA743991 AA604852 AW272737 
AA613792 AW182329 TO5304 AWB58385 
BE177494 AW276909 AA632849 
TB2172AF147324 T52248 
AA649530 AA659316 H64973 
AA744693AW750059 
AW379879AI126285H12014 
AI475858AW969013 
AI480316AW847535 
M88178AI813822D56993 
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BE077084 AW139963 AW863127 AW806209 AW806204 AW806205 AW806206 AW80621 1 AW806212 
AW806207 AW806208 AW906210 AI907497 

AW838616 AW838660 BE1 44343 AI914520 AW888910 BE184854 BE184784 
AL133761 AL133767 

BE176479 BE176678 BE176357 BE176550 AW88S079BE1 76676 BE176616 BE176555 BE176469 BE176610 
BE176362 

AW894017 AW893B56 AW894032 

TABLE 5B shows the genomic positioning for those primekeys lacking Unigene IDs and 
accession numbers in Tables 5, 6, and 7. For each predicted exon, we have listed the 
genomic sequence source used for prediction. Nucleotide locations of each predicted exon 
are also listed. 

Pkey: Unique number corresponding to an Eos probeset 
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the 
publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 
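The nucleotide-position column packs one or more predicted exons into a comma-separated list of start-end ranges. The sketch below shows one way such a field could be read. The function names are hypothetical, and treating both coordinates as inclusive is an assumption, since the text does not state the convention. Entries that lost their hyphen or were truncated in reproduction are rejected rather than guessed at.

```python
from typing import List, Tuple


def parse_exons(nt_position: str) -> List[Tuple[int, int]]:
    """Split a field such as '90044-90184,91111-91345' into (start, end) pairs."""
    exons: List[Tuple[int, int]] = []
    for token in nt_position.split(","):
        start_text, sep, end_text = token.strip().partition("-")
        if not sep or not start_text.isdigit() or not end_text.isdigit():
            raise ValueError(f"unreadable exon range: {token!r}")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"exon start exceeds end: {token!r}")
        exons.append((start, end))
    return exons


def total_length(exons: List[Tuple[int, int]]) -> int:
    # Assumes inclusive coordinates, so a range a-b spans b - a + 1 nucleotides.
    return sum(end - start + 1 for start, end in exons)


# Pkey 401045 (GI 8117619, plus strand) as listed in the table.
EXONS = parse_exons("90044-90184,91111-91345")
```

Raising an error on a damaged field keeps a garbled entry from silently turning into a wrong coordinate.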



Pkey  Ref  Strand  Nt_position 
401045  8117619  Plus  90044-90184,91111-91345 
401424  8176894  Plus  24223-24428 
401451  6634068  Minus  119926-121272 
401714  6715702  Plus  9648446681 
401747  9789672  Minus  118596-118816,119119-119244,119609-119761,12^ 131258,131866-131932,132451-132575,133580-134011 
401785  7249190  Minus  16^6-165996,166189-166314,166408-168569,167112-167268,167387-167469,168634-1^ 
401819  7467933  Minus  28217-28486 
402408  9796239  Minus  110326-110491 
402444  9796614  Plus  28391-28517 
402791  6137008  Minus  51036-51207 
403047  3540153  Minus  59798-59968 
403137  9211494  Minus  92349-9257232958-93084^357943712^3949-94072,94591-94748,95214-95337 
403721  7528046  Minus  156647-157366 
403764  7717105  Minus  118692-118853 
403797  8099896  Minus  123065-125008 
404165  6826489  Minus  6902549128 
404210  5006246  Plus  169926-170121 
404253  9367202  Minus  55675-56055 
404561  9795930  Minus  69039-70100 
404571  7249169  Minus  112450-112648 
404721  9856648  Minus  173763-174294 
404915  7341766  Minus  100915-101087 
404939  6862697  Plus  175318-175476 
405403  6850244  Minus  37491-37670,40951-41031 
405685  4508129  Minus  3795648097 
405718  9795467  Plus  113080-113266 
405793  1405887  Minus  8919749453 
405876  6758747  Plus  89694-40031 
405917  7712162  Minus  106829-107213 
406414  9256407  Plus  4959349850 
406554  7711566  Plus  106956-107121 

TABLE 6: 286 GENES ENCODING EXTRACELLULAR OR CELL SURFACE 
PROTEINS UP-REGULATED IN PROSTATE CANCER COMPARED TO 
NORMAL ADULT TISSUES 

Table 6 shows 286 genes up-regulated in prostate cancer compared to normal adult tissues 
that are likely to be extracellular or cell-surface proteins. These were selected as for Table 5, 
and in addition the predicted protein contained a structural domain that is indicative of 
extracellular localization (e.g. egf, 7tm domains). 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: Ratio of tumor to normal tissue 
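Because the rows are characterized by a tumor-to-normal ratio (R1), a reader working with the table electronically may want to order or threshold on that column. The following is a hedged sketch only: the `Row` type, the `rank_by_ratio` helper, and the cutoff value are hypothetical and are not the selection procedure used to build the table. The three rows are entries whose values are legible in the listing.

```python
from typing import Iterable, List, NamedTuple


class Row(NamedTuple):
    pkey: int
    ex_accn: str
    unigene_id: str
    title: str
    r1: float  # ratio of tumor to normal tissue, as printed in the table


ROWS = [
    Row(445413, "AA151342", "Hs.12677", "CGI-147 protein", 12.64),
    Row(445472, "AB006631", "Hs.12784", "Homo sapiens mRNA for KIAA0293 gene, par", 17.00),
    Row(420544, "AA677577", "Hs.98732", "Homo sapiens Chromosome 16 BAC clone CIT", 12.79),
]


def rank_by_ratio(rows: Iterable[Row], min_ratio: float = 0.0) -> List[Row]:
    """Keep rows at or above min_ratio, highest ratio first."""
    kept = [row for row in rows if row.r1 >= min_ratio]
    return sorted(kept, key=lambda row: row.r1, reverse=True)


# The 12.7 cutoff is arbitrary and chosen only to show the filter at work.
TOP = rank_by_ratio(ROWS, min_ratio=12.7)
```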




Pkey  ExAccn  UnigeneID  Unigene Title  R1 


409361 


NM 005982 




dno notiHe hnmonhnV /n mefflh flo \ hntnolA 

who uwik> iiuiirouuuA ^ufudu^nuay i imiiutu 


4828 


409731 


AA125985 


He 45 


thvnwtdn Koto iffontFftorf in naiimhfaet 

vtyiimsHiii uo ia, fUBfiuiiou in iiouruuicttK 


4524 


400298 


AA032279 


ns.Diooo 


soe uansmernDiane eptinetsj antigen ox 


4^43 


420154 


A1093155 




i)m07 nmtfiin 

uiVEsf proieui 


41.12 


426747 


AA535210 


He 1710QR 


KainKFBin o, prostate specinc anugen 


31 B0 


400299 


X07730 


Uo 171 QQR 


KaiuKrein a, (prostate specific anugen 


2451 


425075 


AA506324 


Uc 1QR9 
nS. lOOc 


acn pnospnatase, prostate 
neuropeptide Y 


2423 


424846 




ns.lo3Z 


2357 


405685 








2050 


420757 


X78592 


Me QQQ1K 


androgen receptor (dtiyoxotestosterone r 


19.72 


418994 


AA296520 




seiecun t ^encouieiiaj aonesion moiecui 


1956 


452792 


AB037765 




i\iA/\iiM4 protein 


17-39 


445472 AB006631 Hs.12784 Homo sapiens mRNA for KIAA0293 gene, par 17.00 
414565 AA502972 Hs.183390 hypothetical protein FLJ13590 16.82 
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain 1650 
408430 379876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine 1628 
408000 L11690 Hs.520 bullous pemphigoid antigen 1 (230/240kD) 19 y3*r 
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface 1540 
444484 AK002126 Hs.11260 hypothetical protein FLJ11264 1476 
416601 AA279490 Hs.56368 calmegin 1458 
448999 AF179274 Hs.22791 transmembrane protein with EGF-like and 1455 
416182 NM_004354 Hs.79069 cyclin G2 1254 
420544 AA677577 Hs.98732 Homo sapiens Chromosome 16 BAC clone CIT 12.79 
445413 AA151342 Hs.12677 CGI-147 protein 12.64 
453930 AA419466 Hs.36727 hypothetical protein FLJ10903 1222 
440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 1204 
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 1156 
450203 AF097994 Hs.301528 L-kynurenine/alpha-aminoadipate aminotra 1158 
448045 AJ297436 Hs.20166 prostate stem cell antigen 1151 
449650 AF055575 Hs.23838 calcium channel, voltage-dependent, L ty 11.18 
420381 050640 Hs.337616 phosphodiesterase 3B, cGMP-inhibited 11.10 
425665 AK001050 Hs.159066 hypothetical protein FLJ10188 11.08 
425710 AF030880 Hs.159275 solute carrier family, member 4 11.08 
428728 NM_016625 Hs.191381 hypothetical protein 11.04 
407021 U52077 gb:Human mariner1 transposase gene, comp 11.02 
410733 D84284 Hs.66052 CD38 antigen (p45) 1152 
452340 NM_002202 H£L505 ISL1 transcription factor, LIM/homeodoma 1055 
428819 AL135623 Hs.193914 KIAA0575 gene product 1048 
421991 NM_014918 Hs.110488 KIAA0990 protein 1054 
431217 NM_013427 Hs.250830 Rho GTPase activating protein 6 9.75 
421470 R27496 Hs.1376 annexinAS 954 
409262 AK000631 Hs.52256 hypothetical protein FLJ20624 945 
435980 AF274571 Hs.129142 deoxyribonuclease II beta 924 
421246 AW582962 Hs.102897 CGW7 protein 920 
410001 AB041036 Hs.57771 kallikrein 11 953 
441791 AW372449 Hs.175982 hypothetical protein FLJ21159 952 
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404571 








8.68 




456497 


AW967956 


Hs.123648 


ESTs, Weakly stntilar to AF1084601 ubinu 


8.53 




419968 


X04430 


Hs.93913 


InterleuHn 6 (Interferon, beta 2) 


8^5 




433172 


AB037841 


Hs.102652 


hypothetical protein ASH1 


8.30 


5 


422631 


BE218919 


Hs.1 18793 


hypothetical protein FU 10688 


827 




427674 


NM.003528 


Hs2178 


H2B histone famSy, member Q 


820 




404915 






8.08 




452259 


AA317439 


Hs28707 


signal sequence receptor, gamma (transto 


6.06 


10 


452691 


N75582 


Hs.2 12875 


ESTs, Weakfy simiiar to DYH9 JKJMAN CILIA 


8.02 


439731 


AI953135 


Hs.45140 


hypothetical protein FU 14084 


7.98 




419839 


U24577 


H&93304 


phosphofipase A2, group VQ (ptatelet^ac 


7.68 




420120 


ALD49610 


Hs. 95243 


transcription elongation factor A (SI!)- 


7.64 




424099 


AF0712Q2 


Hs. 139338 


ATP-bbiding cassette, sub-famBy C (CFTR 


7.64 


15 


448706 


AW291Q95 

niiw ivM 


K&21814 


Intertauldn 20 racentor aloha 


7.52 


410227 


AB009284 


H&61152 


exostoses (rnuttip!e)-fifoe 2 


7.49 




425211 


M18667 


Hs.1867 


progastricsin (pepsinogen C) 


7^5 




441738 


AW292779 


Hs.169799 


ESTs 


728 




419991 


AJ000098 


H&34210 


eyes absent (Drosophfla) homolog 1 


720 


20 


425016 


BE245277 


Hs.154196 


E4F transcription factor 1 


720 


424560 


AA158727 


Hs.1 50555 


protein predicted by clone 23733 


7.18 




409110 


AM 91493 


Hs.48778 


niban protein 


7.10 




421566 


KM 000399 


H&1395 


early growth response 2 (Krox-20 (Dfosop 


7.Q4 




431725 


X65724 


H&2839 


Norrte disease (pseudoglbma) 


6.98 


25 


*tdflOC 


U66463 


He 159525 


caH arowth naoulatDrv with EF-hand dome 


6.85 


427408 


AA58320S 


Hs2156 


RAR-related orphan receptor A 


6.79 




435604 


AA625279 


H&268S2 


uncharacterized bone marrow protein BM04 


6.73 








Hs.78893 


KIAAQ244 orotein 


634 




401451 








632 


30 


431778 


AL080276 


Hs268562 


regulator of G-protein signalling 17 


631 


Anarsta 


NM 014781 

1 1 1»l u tut O 1 


H&50421 


KIAAQ203 nana n induct 


630 




431992 


NM 002742 


Hs2891 


□rate in kinase C mu 


6.49 




404253 


AF026692 


H&.105700 

fl* IIW/UU 


secreted frizzted-relatsd orotetn 4 


6.42 
6.41 


35 


416806 


NM 000288 


Hs.79993 


namxtsornal blooenesls factor 7 


6.38 


431858 


X63629 


H&2877 


cadharin 3 tvDS 1. P-cadherin fblaoenta 


6.30 




43d3Sfi 


AF1 n0143 
i%r iuu i*tw 


H&6540 


fibroblast arowth factor 13 


630 




416836 


054745 


H&8Q247 


chalacvstokinin 


630 




433333 


AF034837 


Hs.192731 


douhte-strandad RNA soecific adenosine d 


629 


40 




AW1 62923 


H&25363 


nrasenlUn 2 fAtzhafrnardlsaasa 4^ 


625 


A143A4 


NM (WV&01 


H&75334 




622 




423349 


ATO10258 


Hs.1 27428 


homao box A9 


620 




424800 


AL035588 


Hs.153203 


MvoD fa mil v inhibitor 


6.18 




425451 


AP242769 


Hs.157461 


mesanchvmal stem cell crotein DSC54 


6.14 


45 


447359 


NM 012093 


Hs.1 8268 


fiufiflUtrllA iunnSA 9 


630 


4108B9 

*MU009 


X916S2 


H&J66744 


twist (DresooMa) homotoo farmcenhahs 


537 






NM 006042 


H&48384 


hanaian sulfate fatucosamlnel 3-0-sutfot 


534 




453911 


AW503857 


H&4007 


Sarcolernrnal-assocjatad protein 


534 




408875 


NM 015434 


H&48604 


DKFZP434B168 orotein 


532 


50 


450480 


X82125 


H&25040 


zfac finger protein 239 


530 


451684 


AR216751 


H&26813 

1 hi if W Iv 


CDA14 


538 




400301 


X03635 


Hs.1 657 


astmoen laceotor 1 


5.78 




41DU// 


141607 


Hs334 


fttit<yiQflmIm/l fM-flCPtvft tmnftforflRfl 2 i 


5.74 




418852 


BE537Q37 


H&273294 


hypothetical protein RJ20069 


5.72 


55 


446857 




Hs.1 6349 


KIAA0431 orotein 


5.72 


410232 


AW372451 


HS.61184 


CGl-79 protein 


5.70 




422762 


AU03132O 


Hs.1 19976 


Human DMA sequence from clone RP1-20N2 o 


5.70 




450616 


AL133067 


H&302689 


hypo flietical protein 


5.70 




408621 


AI970672 


Hs.46638 


chromosome 11 open reading frame 8 


535 


60 


439671 


AW162840 


Hs.6641 


ktoesin family member 5C 


534 


410196 


AI936442 


Hs39838 


hypoflieticai protein RJ 10808 


530 




429170 


NM.001394 


KS2359 


dual spectfteity phosphatase 4 


530 




440738 


AI004650 


H&225674 


WD repeat domain 9 


530 




414342 


AA742181 


Hs.75912 


WAAD257 protein 


539 


65 


422634 


NM.016010 


Hs.1 18821 


CGI-62 protein 


536 


400268 






535 




439569 


AW6Q2166 


K&222399 


CEGP1 protein 


531 




452823 


AB012124 


HS30698 


transcription fector-fte 5 (basic helix 


5.48 




431938 


AA938471 


HS54431 


specific granule protein (28 kDa); cyste 


5.44 




427638 


AA406411 


HS208341 


ESTs, Weakly similar to K1AAQ989 protein 


542 
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421264 


AL039123 


Hs 103042 


inicrotubute'flssod&ted protein 1B 


5.38 




421685 


API 89723 


Hs 106778 


ATPasa Ca4-+ transDortina tvoa 2C. memb 


557 




421987 


AI133161 


H&286131 


CGI-101 protein 


558 




422808 


BE314787 


Ks.1581 


olutathlofifl S^transfsrasa thsta 2 


554 


5 


432281 


AK001239 


Ks274263 


hvDOthstical orotein FU10377 


5.32* 




451982 


F13036 


K&27373 


Homo sapiens mRNA; cDNA DKFZp56401763 (f 


552 




444042 


NM 004915 

lift « yUl w 1 w 


Hs.10237 


ATP-blnAng cassette, sub-fem3y Q (WHIT 


551 




447752 


M 73700 

111/ Wf w 


Hs. 105938 


tadflfiBnsffliibi 


5.29 


10 


451418 


RP3877BQ 

UWBI lev 


Hs 96369 


hvpflftaficfll Diotein H .190987 


522 


428593 


AW207440 


Hs. 185973 


degenerative spermstocyte (homotog Droso 


551 




447541 


AK00Q288 


Hs.18800 


hypothetical protein HJ20281 


5.18 




459294 


AW977286 


Hs.17428 


RBPI-fika niotfilri 


5.16 




424692 


AA429834 


1 I9» 1 w 1 IOI 


KIAA0092 nana nroduct 


5.15 


15 


416434 


AW1 63045 


Ks.79334 


njidear facfor Intsrtauldn 3 rftaulatftd 


5.11 


410268 


AA316181 

IVWIw IB 1 


Hsj61635 




5.10 




417517 


AF001176 


H&82238 


P0P4 fDiDGBssIna of nrerursor . S. cersv 


5.10 




453616 

HOOV JU 


NM 003459 


Hs.33846 




5.10 




497058 


AA418000 


HS58280 


pwinajun u u iwiiiiQuiatDf atiivui wiiuuvwuiw 


5.09 


20 




AO0CUO 


Hs.606 


ATPasa Qm transaorting aloha nofunao 


5.08 


418578 




Hs 95O104 


AJu-blndina n rota In wHh zincfinoar dom 


5.05 




413398 


Y15793 


Hfi.75905 


nusnulata rvHaeA 1 enhihln afnhfl 3 


5.04 




<tO£f £9 


AIC0Q09O9 


He 9757*59 


hunflfhoftml nmioin FI^I9Q985 


504 




496349 


AF093419 


Us 150375 


miitHntA PD7 Hnmflfn nmtfitn 


5.02 


25 


429782 


NM 005754 


Hs 520689 


Ras-QTPass-acHvatinn crotatn SH3-domatn 


5.02 




AWR50417 


He 954090 


pCTc Mnrfomtntv ehnllnr tn imnnmflr! nmt 


5.02 






MM 004A55 
l>llV\JUw4000 


He 9471 1ft 


nhnenhnfiriuRnneilnl fllur*an rineeR 


5 XX) 




451 TO6 


AR090006 

MDUC9UUO 


H&96334 


enfleHr rwmnlonlfl 4 faiitnsfVTial rinmtnanf 

0|JBK>UU pOlflUMO^ia "f IGUJIW9VIIWU UWIIUIKMIi 


5X0 




457911 


AW979565 




ESTfi. Wealdv dmftar to S51797 vasodHat 


457 


30 


495551 


NM 0014.00 


He_15Q549 

n* I9904C 


alimfiamtnvl nM-aratvO transfarasa 1 c 


457 


491 coq 


N 87820 


He 106896 

MS. IUOQ6V 


K1AA169B nratfiln 


453 




41fTC33 


RP944053 

DC&44UOO 


He_7Q369 
no.' Bmfi 


mbnobtflstoma-RkB 2 fo130i 


452 




439653 


N69006 


H&993185 


PCTe Weakfv simflar to JC7328 amino ad 


451 




403047 








451 


35 




nrwUOOA 


He 9505n0 


Holla iTlmefinhiloWiira 1 


450 


42/01/ 


U4tU0O 


He 1001 70 




458 






AKfifl0715 


Hftl 03736 
no. 190/ 00 


hvnnftintirfll rvntarn PI .190706 


458 




440071 


MM 005879 


Hs_99fi60 


hraaet flafrinnutt amnlKiad fianuenca 9 


456 




407390 


rlotw 10 




nh<ufi30f05 rl Qitarae fetal Ih/ar enfoon 


454 


40 


4R551A 


BP1 79704 


He 999746 


K1AA1610 nmtein 


454 


450330 
400009 


#\VfO(0000 


H« 179843 


ESTs 


453 




499053 


MM 001141 


He 111955 


anichldnnata 1&4!noxvnflnasa sacond ivo 


452 






W15967 


He 93579 


Inur riane!h/ R nnfimtom raflantnMBlfltad 


452 




4CfcU4o 


NM 019445 


He 985196 


spondtn 2 t extracellular matnx protein 


452 


45 


4Z40UZ 


AK002C55 


He 151045 

ns. 1 o 1 U40 


hvnottwlleal nmtaln R .11 1 183 


4.78 


41(1755 


AI604979 


He 66180 


nurTiaoei^n^a aeeanif^iv fArt^iain 1*iilra 9 


4.77 




41O07Q 


717505 


He 03564 




4.74 




450649 
•www 


NM 001490 


He 95979 


E1 A bindina Droteln d300 


4.74 




41 1694 
41 IQ£4 


RF145QR4 
DCI4O904 


He 1fWM 


KIAA0504 omtnln 


4.72 


50 


404791 








470 


496951 
4&D&D1 


AW949943 

n¥»£4tt40 


He 155570 
no. ivoo/u 


nflmvfeofmt famoevlafari nrntafn 

UXJl WU9WI 1 m IGUIfOOJfMllOU ptUIDUl 


4.70 




416976 

4IO£/U 


1141 060 


Uc 70136 

113.1 O IOO 


1 IV-1 nmtefn estmaan reoutatad 


454 




405374 
4UOOf4 


AW095430 


He 1R55Q1 
na.lOOOal 


(nrkhoarf hny PI 

lUIIUIOCUl UUA 1 1 


4.64 




451900 

43 IS IAS 


AR09310Q 

nOytJ ID9 


He 97907 
nax r £ vf 


KI AA0982 orotein 


4.63 


55 


491437 


AWR91959 

MiV0£ IOC 


He 104335 




453 


. 434629 


AA789081 


Hs.4029 


gtoma-ampfified sequenoe-41 


450 




403764 458
421247 BE391727 Hs.102910 general transcription factor IIH, polype 453
403721 450
453070 AK001465 Ks^1575 SEC63, endoplasmic reticulum translocon 4.49
417412 X16896 Hs.82112 interleukin 1 receptor, type 1 AM
439735 AI635386 Hs.142848 hypothetical protein AM
430261 AA305127 Hs.237225 hypothetical protein HT023 AM
430598 AK001764 Hs.247112 hypothetical protein FLJ10902 AM
400303 AA242758 Hs.79136 LIV-1 protein, estrogen regulated AM
438209 AL120659 Hs.6111 aryl-hydrocarbon receptor nuclear transl AM
417421 AL138201 Hs.82120 nuclear receptor subfamily 4, group A, m AM
447270 AC002551 Hs.331 general transcription factor IIIC, polyp 458
434423 NM_006769 Hs.3844 LIM domain only 4 455
404561 452
422989 AA782536 Hs.122647 N-myristoyltransferase 2 422
423685 BE350494 Hs.49753 uveal autoantigen with coiled coil domai 422
425071 NM_013989 Hs.154424 deiodinase, iodothyronine, type II 422
431583 AL042613 Hs.262476 S-adenosylmethionine decarboxylase 1 4.31
442818 AK001741 Hs.8739 hypothetical protein FLJ10879 420
423740 Y07701 Hs.293007 aminopeptidase puromycin sensitive 424
424701 NM_005923 Hs.151988 mitogen-activated protein kinase kinase 421
424085 NM_002914 Hs.139226 replication factor C (activator 1) 2 (40 420
410294 AB014515 Hs.323712 KIAA0615 gene product 4.18
447124 AW976438 Hs.17428 RBP1-like protein 4.18
438018 AK001160 Hs.5999 hypothetical protein FLJ10298 4.16
443857 AJ089292 Hs.287621 hypothetical protein FLJ14069 4.15
446711 AF169682 Hs.12450 protocadherin 9 4.15
405403 4.14
448148 NM_016578 Hs.20509 HBV pX associated protein-8 4.13
417531 NM_003157 Hs.1087 serine/threonine kinase 2 4.12
433345 AI581545 Hs.152982 hypothetical protein FLJ13117 4.10
432712 AB016247 Hs.288031 sterol-C5-desaturase (fungal ERG3, delta 4.09
435114 AA775483 Hs.288936 mitochondrial ribosomal protein L9 4.08
445459 AI478629 Hs.158465 likely ortholog of mouse putative IKK re 4.08
402791 4.04
438660 U95740 Hs.6349 Homo sapiens, clone IMAGE:3010668, mRNA, 4.04
447568 AF155655 Hs.18885 CGWOproteln 424
452211 AI9B5513 Hs.233420 ESTs 4.02
443292 AK000213 Hs.9196 hypothetical protein AM
420911 U77413 Hs.100293 O-linked N-acetylglucosamine (GlcNAc) tr 4.00
428738 NM_000380 Hs.192803 xeroderma pigmentosum, complementation g 325
430456 AA314998 Hs.241503 hypothetical protein 325
437531 AI400752 Hs.112259 T cell receptor gamma locus 323
428695 AI355647 Hs.189999 purinergic receptor (family A group 5) 321
410011 AB020641 Hs.57856 PFTAIRE protein kinase 1 321
446494 AA463276 Hs.288906 WW Domain-Containing Gene 321
409928 AL137163 Hs.57549 hypothetical protein dJ473B4 3.90
411598 BE336654 Hs.70937 H3 histone family, member A 320
425707 AF115402 Hs.11713 E74-like factor 5 (ets domain transcript 320
451806 NM_003729 Hs.27076 RNA 3'-terminal phosphate cyclase 329
401045 329
437372 AA323968 Hs.283631 hypothetical protein DKFZp547G183 329
417067 AJ001417 Hs.81086 solute carrier family 22 (extraneuronal 328
410467 AF102548 Hs.63931 dachshund (Drosophila) homolog 328
431930 AB035301 Hs.272211 cadherin 7, type 2 328
453047 AW023793 Hs.286025 ESTs 328
401785 328
458229 AI929502 Hs.177 phosphatidylinositol glycan, class H 326
406414 326
412494 AL133900 Hs.792 ADP-ribosylation factor domain protein 1 324
418329 AW247430 Hs.84152 cystathionine-beta-synthase 323
424850 AA151057 Hs.153498 chromosome 18 open reading frame 1 322
427585 D31152 Hs.179729 collagen, type X, alpha 1 (Schmid metaph 322
423052 M28214 Hs.123072 RAB3B, member RAS oncogene family 322
416111 AA033813 Hs.79018 chromatin assembly factor 1, subunit A ( 322
419423 D26488 Hs.90315 KIAA0007 protein 320
429843 AA455889 Hs.167279 FYVE-finger-containing Rab5 effector pro 320
431499 NM_001514 Hs.258581 general transcription factor IIB 320
444078 BE246919 Hs.10290 U5 snRNP-specific 40 kDa protein (hPrp8- 3.78
430291 AV660345 Hs.238128 CGI-49 protein 3.76
431637 AI879330 Hs.265960 hypothetical protein FLJ10563 3.74
440411 N30256 Hs.151093 hypothetical protein DKFZp434G1415 3.74
405917 3.74
451230 BB46208 Hs.26090 hypothetical protein FLJ20272 3.73
429597 NM_003816 Hs.2442 a disintegrin and metalloproteinase doma 3.73
415075 L27479 Hs.77889 Friedreich ataxia region gene X123 3.72
440351 AF030933 Hs.7179 RAD1 (S. pombe) homolog 3.70
443603 BE502601 Hs.134289 ESTs, Weakly similar to KIAA1063 protein a70
446965 BE242873 Hs.16677 WD repeat domain 15 3.70
412350 AI659306 Hs.73828 protein tyrosine phosphatase, non-recept 3.70
4338S2 AI37B329 Hs.126629 ESTs 170
447397 BE247678 Hs.18442 E-1 enzyme 3.68
405718 3.68
425217 AU076696 Hs.155174 CDC5 (cell division cycle 5, S. pombe, h 3.68
421734 AB18624 Hs.107444 Homo sapiens cDNA FLJ20562 fis, clone KA 3.67
427221 L15409 Hs.174007 von Hippel-Lindau syndrome 3.67
402408 3.68
452948 X95425 Hs.31092 EphA5 3.66
419078 M93119 Hs.89584 insulinoma-associated 1 3.66
427144 X95097 Hs.2126 vasoactive intestinal peptide receptor 2 3.65
423396 AI382555 Hs.127950 bromodomain-containing 1 3£5
446320 AF126245 Hs.14791 acyl-Coenzyme A dehydrogenase family, me 353
404939 352
403137 350
437162 AW005505 Hs.5464 thyroid hormone receptor coactivating pr 3.60
404210 3.59
443775 AF291664 Hs.204732 matrix metalloproteinase 26 356
452501 AB037791 Hs.29716 hypothetical protein FLJ10980 356
422443 NM_014707 Hs.116753 histone deacetylase 7B 355
420230 AL034344 Hs.284188 forkhead box C1 3.55
418428 Y12490 Hs.55092 thyroid hormone receptor interactor 11 354
433002 AF048730 Hs.279908 cyclin T1 353
405793 352
457940 AL360159 Hs.306517 Homo sapiens tripartite motif protein ps 352
402444 352
418250 U29926 Hs.83918 adenosine monophosphate deaminase (isofo 3£\
414222 AL135173 Hs.878 sorbitol dehydrogenase 351
422384 AA224077 Hs.42438 Sm protein F 350
447805 AW627932 Hs.19614 gemin4 350
454265 H03556 Hs.300949 ESTs, Weakly similar to thyroid hormone 350
423445 NM_014324 Hs.128749 alpha-methylacyl-CoA racemase 3.48
413435 X51405 Hs.75360 carboxypeptidase E 346
447210 AF035269 Hs.17752 phosphatidylserine-specific phospholipas 046
426931 NM_003416 Hs.2076 zinc finger protein 7 (KOX 4, clone HF.1 3.45
408418 AW963897 Hs.44743 KIAA1435 protein 3.45
421887 AW161450 Hs.109201 CGI-86 protein 3.44
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Table 7: 42 GENES ENCODING SMALL MOLECULE TARGETS UP-REGULATED IN 
PROSTATE CANCER COMPARED TO NORMAL ADULT TISSUES 

Table 7 shows 42 genes up-regulated in prostate cancer compared to normal adult tissues that
are likely to be small molecule targets. These were selected as for Table 5, with the further
requirement that the predicted protein contain a structural domain indicative of a druggable
structure (e.g., protease, kinase, phosphatase, receptor). The functional domain is indicated for each gene.
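
The domain-based selection described above can be illustrated with a short Python sketch. It is not the procedure actually used to build Table 7: the keyword list, the record layout and the function names are assumptions made for illustration, and the third record is a made-up negative example. The two kinase records reuse Pkeys, titles, domain labels and ratios that appear in Table 7.

```python
# Hypothetical keyword list. The text names only protease, kinase,
# phosphatase and receptor as examples of druggable domain classes.
DRUGGABLE_KEYWORDS = ("trypsin", "peptidase", "pkinase", "phosphatase", "7tm", "hormone_rec")


def has_druggable_domain(domains):
    """Return True if any domain label contains one of the assumed keywords."""
    lowered = [label.lower() for label in domains]
    return any(keyword in label for label in lowered for keyword in DRUGGABLE_KEYWORDS)


def select_small_molecule_targets(records):
    """Keep records with a druggable domain, sorted by descending tumor/normal ratio."""
    hits = [record for record in records if has_druggable_domain(record["domains"])]
    return sorted(hits, key=lambda record: record["ratio"], reverse=True)


records = [
    {"pkey": "410011", "title": "PFTAIRE protein kinase 1", "domains": ["pkinase"], "ratio": 3.91},
    {"pkey": "417531", "title": "serine/threonine kinase 2", "domains": ["pkinase"], "ratio": 4.12},
    # Made-up record with no druggable domain, included only to show the filter.
    {"pkey": "000000", "title": "hypothetical structural protein", "domains": ["coiled_coil"], "ratio": 5.00},
]

selected = select_small_molecule_targets(records)
for record in selected:
    print(record["pkey"], record["title"], record["ratio"])
```

Matching on substrings keeps the sketch tolerant of composite labels such as a kinase domain listed together with other domains; a real pipeline would more likely match against exact domain-family identifiers.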

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
PSDomain: Protein Structural Domain
R1: Ratio of tumor vs. normal tissue



Pkey ExAccn UnigeneID Unigene Title PSDomain R1

20 426747 AA535210 Hs, 171 995 kalUkrein 3, (prostata specific antigen trypsin 3130 

400299 X07730 Hs.171995 kaUikrein 3, (prostate specific antigen trypsin 24.91 

420757 X78592 Hs.99915 androgen receptor (dlhydrctestosterone r Androgeruecep^ormone_rec f zf-C4 19.72 

408430 S79876 H&44926 dipeptidylpepfidase IV (C028, adenosine DPPiV_N_term f Pepfldase_S9 1628 

430228 BE245562 H&2551 adrenergic, beta-2-, receptor, surface 7tmJ 15.40 

25 411096 U80034 H&68583 mitochondrial intermediate peptidase Pepttasejft 1431 

440286 U29589 Hs.7138 cholinergic receptor, muscarinic 3 7tmJ 12j04 

420381 D50640 Hs337616 phosphodiesterase 3B, cGMP-tnrUDited PDEase 11.10 

407021 U52077 gbjHuman marmerl tmnsposase gene, comp SET.TransposaseJ 1 132 

401424 arginase 938 

30 410001 AB041036 Hs37771 kaffikrein 11 trypsin 933 

428330 L22524 Hs.2256 matrix metatoprotelnase 7 (matnTysin, Peptidase JA10 8.76 

424099 ARJ712Q2 H&139336 ATP-Wndmg cassette, sub-famQy C (CFTR ABCJrari f ABCjnernbrane 734 

419991 AJ000098 H&94210 eyes absent (Diosophila) homolog 1 Hydrolase 720 

431992 NMJ002742 H&2891 protein kinase C, mu pkinase.DAQ^PE-blnd^H 6.49 

35 447359 NMJM2093 Hs.18268 adenylate kinase 5 adenyiatekinase 630 

400301 X03635 Hs.1657 estrogen receptor 1 OesLrecep^C4,hormone.rec 5.78 

421685 AF189723 Hs,1 06778 ATPase, Ca++ transporting, type 2C. memb E1-ELATPase,Hydrola$e 537 

444042 WL004915 Hs. 10237 ATP-blnding cassette, sub-family G (WHIT ABCJran 531 

447752 M73700 Hs.105938 iactotransferrin transfenin,7tnx.1 529 

40 407945 X69208 Hs.606 ATPase, Cu++ transporting, alpha porypep El -ELATPase,Hydro!ase,HMA 5.08 

403047 trypsin 431 

427617 D42063 Hs.199179 RAN binding protein 2 Ran.BP1^«lanBP,TPR^Jsomerase 438 

422083 NM.001141 Hs,1 1 1256 arachidonate 15-Bpoxygenase, second typ Epoxygenase.PLAT 432 

449535 W15267 H&23672 low density lipoprotein receptor-related idLmceptJ)JdLraoept_a f EGF 432 

45 425071 NM.013989 Hs.154424 dekxfinase, iodothyronirm, type II T4.detodinase 432 

423740 Y077O1 Hs293007 eminopeptidase puramycin sensitive Peptidase _M1 424 

424701 NMJM5923 Hs.151988 maogen-acfivated protein kinase kinase pkinase 421 

424085 NM.002914 H&139226 replication factor C (activator 1) 2 (40 AAA,ViraLheScase1 420 

417531 NM.003157 Hs.1087 serine/threonine kinase 2 pkinase " 4.12 

50 428695 AI355647 Hs.189999 purlnergic receptor (family A group 5) 7tmJ 331 

410011 AB020641 H&57B56 PFTAIRE protein kinase 1 pkinase 3.91 

424850 AA151Q57 Hs.153498 chromosome 18 open reading frame 1 kOjecepLa 332 

412350 A1559306 Hs.73826 protein tyrosine phosphatase, non-reoept Yjhosphatase^and J1 ,P0Z 3.70 

447397 BE247676 Hs.18442 E-1 enzyme Hydrolase 3.68 

55 452946 X95425 Hs31092 EphA5 EPH_lbd,m3^ldnase l SAM 3.66 

427144 X95097 H&2126 vasoactive intestinal peptide receptor 2 7tm_2 3.65 

443775 AF291664 Hs204732 matrix metaOoprotefnase 26 Peptidase.M10 336 

457940 AL360159 Hs306517 HomosaptensTRtparfiternctif rxot^ SPRY,7tmJ 332 

418250 U29926 Hs33918 adenosine monophosphate deamiriase (isefo deaminase 331 

60 413435 X51405 K&75360 carboxypeptidase E ZrucarbOpept 3.48 

447210 AF035269 Hs. 17752 p^hatidyiserine-spe* phosphofipas lipase 3.46 
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TABLE 8: 136 GENES SIGNIFICANTLY DOWN-REGULATED IN PROSTATE 
CANCER COMPARED TO NORMAL PROSTATE 

Table 8 shows 136 genes significantly down-regulated in prostate cancer compared to normal
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was
greater than or equal to 2. The "average" normal prostate level was set to the mean amongst 4
normal prostate tissues. The "average" prostate cancer level was set to the 85th percentile
amongst 73 tumor samples. In order to remove gene-specific background levels of non-specific
hybridization, the 10th percentile value amongst all the tissues was subtracted from
both the numerator and the denominator before the ratio was evaluated.
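
The ratio described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the original analysis code: the text does not say how percentiles were interpolated (linear interpolation is assumed here), "all the tissues" is taken to mean the pooled normal and tumor samples, and the function names and the handling of a non-positive denominator are hypothetical.

```python
from statistics import mean


def percentile(values, pct):
    """Return the pct-th percentile using linear interpolation (an assumed convention)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty data")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def normal_to_tumor_ratio(normal_levels, tumor_levels):
    """Background-subtracted ratio of "average" normal to "average" tumor expression.

    Returns None when the background-subtracted tumor level is not positive
    (a hypothetical convention; the text does not cover this case).
    """
    # Assumption: "all the tissues" means the pooled normal and tumor samples.
    background = percentile(list(normal_levels) + list(tumor_levels), 10)
    numerator = mean(normal_levels) - background
    denominator = percentile(tumor_levels, 85) - background
    if denominator <= 0:
        return None
    return numerator / denominator


def is_down_regulated(normal_levels, tumor_levels, threshold=2.0):
    """Apply the stated cutoff: ratio greater than or equal to 2."""
    ratio = normal_to_tumor_ratio(normal_levels, tumor_levels)
    return ratio is not None and ratio >= threshold
```

For example, `is_down_regulated([20, 20, 20, 20], [1, 2, 3, 4, 5])` evaluates the ratio for one probeset with four normal samples and five tumor samples; in the setting described in the text the second list would hold 73 tumor values.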

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of normal prostate to prostate cancer






Pkey ExAccn UnigeneID Unigene Title R1

425832 M81650 Hs.1988 semenogelin I 57.69
425545 N98529 Hs.158295 Human mRNA for myosin light chain 3 (MLC 19.70
426752 X69490 Hs.172004 titin 1525
442082 R41823 Hs.7413 ESTs; calsyntenin-2 10.05
407245 X80568 Hs.172004 titin 938
422711 D60641 Hs.21739 Homo sapiens mRNA; cDNA DKFZp586I1518 (1 9.05
420813 X51501 Hs.99949 prolactin-induced protein 8.18
411987 AA375975 Hs.183380 "ESTs, Moderately similar to ALU7_HUMAN 745
404567 5.62
416030 H15261 Hs.21946 ESTs 551
444892 AI620617 Hs.148565 ESTs 527
444573 AW043590 Hs.225023 ESTs 520
428088 AW016437 Hs.233462 ESTs 5.08
437440 AA846804 Hs.123694 ESTs 435
404113 4.75
452279 AA286844 Hs.61260 hypothetical protein FLJ13164 4.75
421058 AW297987 Hs.188161 ESTs 4.63
445592 AV654382 Hs.17947 "ESTs, Weakly similar to K02F3.10 [C.ele 453
405163 4.49
405227 4.45
454059 NM_003154 Hs.37048 statherin 445
450152 AI138635 Hs.22968 ESTs 440
407013 U35637 "gb:Human nebulin mRNA, partial cds" 443
403612 4.02
440089 AA864468 Hs.135646 ESTs 4.00
408988 AL119844 Hs.49476 Homo sapiens clone TUA8 Cri-du-chat regi 3.88
436726 AA324975 Hs.128993 "ESTs, Weakly similar to KIAA0465 protei ass
459367 BE148877 •gbKamfrQ244-111199^mi2HT0244Hom 3.95
427318 AF186081 Hs.175783 zinc transporter 3.82
411762 AW860972 "gb:QVO^Tu387-18Q300-167-h07 CT0387 Hom 335
418668 AW407987 Hs.87150 Human clone A9A2BR11 (GAC)n/(GTG)n repea 3.75
458311 AF069478 "gb:AF069478 Homo sapiens astrocytoma 0 3.61
403649 330
419682 H13139 Hs.82282 paired-like homeodomain transcription fa 358
412519 AA196241 Hs.73980 "troponin T1, skeletal, slow" 351
414203 AW276887 Hs.46609 ESTs 345
427419 NM_000200 Hs.177888 histatin 3 337
420777 AA280223 Hs.130865 ESTs 335
428134 AA421773 Hs.161008 ESTs 331
450218 R02018 Hs.168640 "Ank, mouse, homolog of 330
433474 AI192195 Hs.147174 "EST, Highly similar to ubiquitin-protei 3.30
418833 AW974899 Hs.292778 ESTs 326
400440 X83957 Hs.83870 nebulin 3.16
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413778 AA09Q235 Hs.75535 •myosin, light polypeptide 2, regulatory 3.06 

423151 AW838068 *gb«V3lTD04W)10300-1(»-lD2 LTO048 Horn 3.05 

445060 AAB3Q811 H&88808 ESTs 2*98 

457085 A1476318 Hs.192480 ESTs 2X5 

5 432456 K00093 "gb:ph8f12u_19/1TV Outward Akhpnmed hn 2X2 

405678 2X5 

406707 $73840 HsX31 •myosin, heavy polypeptide 2, skeletal m 2X1 

444105 AW189097 Hs.1 66597 ESTs 2.78 

433968 AL157518 KSX0421 PR02463 protein 2.73 

10 438522 AA809431 H&258886 ESTs 2.73 

436562 H71937 Hs.169756 "complement component 1 , s subcomponenf 2X8 

412417 AA102268 Hs42175 ESTs 2.67 

455590 BE072259 a gb^V4^BT0538^71299^g04 BT0536 Horn 2X5 

415380 P07953 Hs. 16085 putative G-protein coated receptor 2X5 

IS 428729 AL162331 Hs.191436 hypothetical protein FU10619 2X4 

408537 AW207734 "gb:UW«l2«ge4H)1-0-UI.8l NCLCGAP-S 2.63 

424706 AA741336 Hs.152108 transcriptional unit N143 2X3 

413212 BB072092 lgbi>ft^T0532-1W200^034>11 BTOSffiHom 2X3 

406704 M21665 HsX29 ■myosin, heavy polypeptide 7, cardiac mu 2X2 

20 437507 AA758538 H&246B82 ESTs 2X0 

410384 AI933784 H&42745 ESTs 2X8 

408074 R20723 Hs.124764 ESTs 2X8 

436653 AA829828 Hs292402 ESTs 2X2 

458090 A1282149 KsX6213 "ESTs, Highly similar to FXD3_HUMAN FORK 2X1 

25 432003 A1689154 Hs. 122972 ESTs 2X0 

436915 AA737400 Hs.142230 ESTs 2X0 

410028 AW576454 H&258553 ESTs 2.46 

448920 AW408009 H&22580 aikytgrycerone phosphate synthase 2.45 

422046 A1638562 "gbis50a10J(1 NCLCGAPJJH Homo sapiens ZM 

30 451122 AA015767 Hs.183587 ESTs 2-40 

422646 H87863 Hs.151380 ESTs 2X6 

451237 AW60Q293 "gb£ST00049 pGEM-T library Homo sapiens 2X6 

400001 AFFX controh BioB-3 2X6 

415835 245365 "gb:HSC2NF061 normalized infant brain cO 2X6 

35 439706 AWB72527 HsX9761 ESTs 2X6 

423341 AW242394 Ha252495 ESTs 2X6 

436486 AA742221 Hs. 120633 ESTs 2X5 

407449 AJ0Q2784 gbiHomo sapiens mRNA; fetal brabi cONA 5 2X3 

430573 AA744550 Hs.136345 ESTs 2X2 

40 401S74 2X1 

443358 AL044498 Hs.133262 "ESTs, WeaWy similar to PH0217 reverse 2X1 

430751 NMJ>12471H$247868 transient receptor potential channel 5 225 

439128 AI949371 Hs.153089 ESTs 225 

448765 R15337 Hs21958 'Homo sapiens cOMA RJ 10532 fis, done N 225 

45 451130 AI762250 H&211347 ESTs 224 

405420 223 

455029 AW851258 a p>:IL3^0220-160200-066-H06 CT0220 Horn 223 

438224 AA933999 B gb:on91fl)4.s1 Soares_NFU_T_GBC_S1 Homo 223 

407764 BG008347 "gb^MOBN0154^80400-325^04 BN0154 Horn 223 

50 413549 BE252470 "gb£01108292F1 NIH_MGC J6 Homo sapiens 223 

437010 AA741368 Hs291434 ESTs 223 

435111 A1914279 Hs213740 ESTs 222 

403375 221 

455060 AW853441 -gb:RC1-CT0252-0301C)0-023-g09 CT0252 Horn 221 

55 409792 AW854153 "p>^C3-CTO254-060400^029-d03 CT0254 Horn 220 

421154 AA284333 Hs287631 "Homo sapiens cONA RJ14269 lis, clone P 2.19 

401963 2.18 

4350X4 AF1 68711 Hs.1 59397 x 010 protein 2.18 

448998 AW998S89 H&1Q5749 WAA0553 protein 2.18 

60 436816 AW297599 Hs255667 ESTs 2.17 

442252 AI733395 Hs.1 29124 ESTs 2.17 

419310 AA238233 Hs.188716 ESTs 2.16 

418579 K91800 Hs.124156 ESTs 2.16 

423315 R54109 Hs26096 ESTs 2.16 

65 432744 AA988835 KsX8664 ESTs 2.15 

424492 AI133482 Hs.165210 ESTs 2.15 

424770 AA425562 "gb2w46e05/1 Soares_totaUetus_Nb2HF8 2.15 

437101 AA744518 Hs.120610 ESTs 2.15 

428783 AC004957 H&288975 "ESTs, Highly similar to coBapsfrv2-fk 2.15 
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415708 H56475 "Sb:yt87d11.r1 Soaresj>tra^landJJ3HPG 2.13 

459619 2.12
427506 AK000134 Hs.179100 hypothetical protein FLJ20127 2.12
452508 AA804174 Hs.184354 ESTs 2.10
410881 AW809157 "gb:RC0-ST0118-041099-031-C07J ST0118 Homo sapiens cDNA, mRNA sequence" 2.10
403087 2.10
403869 2.10
445028 D81194 Hs.282499 ESTs 2.10
447884 H29505 "gb:ym60d10.r1 Soares infant brain 1NIB Homo sapiens cDNA clone 5', mRNA sequence" 2.10
414575 H11257 Hs.295233 ESTs 2.09
420351 BE218221 Hs.190044 ESTs 2.08
426998 BE274360 "gb:601121088F1 NIH_MGC_10 Homo sapiens cDNA clone 5', mRNA sequence" 2.08
405455 2.08
423843 AA332652 "gb:EST36627 Embryo, 8 week I Homo sapiens cDNA 5' end similar to similar to monoamine oxidase B, mRNA sequence" 2.03
406135 2.07
427046 BE246180 Hs.121385 ESTs 2.07
403493 2.05
444514 AI682905 Hs.270431 "ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY [H.sapiens]" 2.05
435884 AA701443 Hs.192868 ESTs 2.05
419629 AB020695 Hs.91662 KIAA0888 protein 2.03
405900 2.03
457350 AW974438 Hs.194136 "ESTs, Moderately similar to AF091457 1 zinc finger protein RIN ZF [R.norvegicus]" 2.02
400007 AFFX control: BioDn-5 2.01
406978 M64358 "gb:Human rhom-3 gene, exon." 2.00
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TABLE 8A shows the accession numbers for those primekeys lacking a unigenelD in Table 
8. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs
and mRNAs. These sequences were clustered based on sequence similarity using Clustering
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers
for sequences comprising each cluster are listed in the "Accession" column. 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
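
The relationship between the three columns can be pictured as a two-step lookup: probeset to gene cluster, then gene cluster to its member sequences. The Python sketch below is illustrative only; the dictionary and function names are assumptions, and the two entries are copied from rows of Table 8A that are legible in the source.

```python
# Probeset (Pkey) -> gene cluster (CAT number); two entries taken from Table 8A.
CLUSTER_BY_PKEY = {
    "413549": "1375933.2",
    "438224": "452656.1",
}

# Gene cluster (CAT number) -> Genbank accessions of the clustered sequences.
ACCESSIONS_BY_CLUSTER = {
    "1375933.2": ["BE252470", "BE147573"],
    "452656.1": ["AA933999", "AA781181"],
}


def accessions_for_probeset(pkey):
    """Return the accessions of the cluster a probeset was designed from.

    An unknown probeset yields an empty list (a hypothetical convention).
    """
    cluster = CLUSTER_BY_PKEY.get(pkey)
    if cluster is None:
        return []
    return list(ACCESSIONS_BY_CLUSTER.get(cluster, []))


print(accessions_for_probeset("413549"))
```

Keeping the cluster as an explicit intermediate key mirrors the table layout, where one cluster number stands for every sequence grouped by similarity.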



Pkey CAT number Accessions

407764 1014849.1 BE008347 BE008320 BE083307 BE083311 AW075968
408537 1064753.1 AW207734 D60164 D81150 D81078 D61356 AW996804
409782 1154677.1 AW854153 AW500210 BE145772 AW501310
410881 1225682.1 AW809157 AW812181 AW812175 AW812172 AW812161 AW812165
411762 1256906.1 AW860972 AW862598 AW862599 AW860988 AW860983 AW860898 AW860925 AW860922 AW850986 AW860984 AW860989
413212 1353792.1 BE072092 BE072106 BE072086 BE072098 BE072103
413549 1375933.2 BE252470 BE147573
415708 1548209.1 H56475 F29401 F34552
415835 1558511.1 Z45365 R25905 H05203 T77496
422046 210744.1 AI638562 T16929 H13401 F07773 R55838
423151 225415.1 AW838068 AW837986 AW838067 AA322487 AW837936
423843 232510.1 AA332652 AA331633 AW999369 AW902993 BE170475 AA378845 AW964175 AI475221
424770 243504.1 AA425562 AI880208 AA346646 N22655 AW811775 AW811786
426998 274259.-1
432458 347718.1 H00093 K00079 H00070 K00054 H00049 H00063 AW905306 AW905241 AW905410 AW905307 AW905411 AW905240 AW905352 AWB05304 AW905239 AW905242 AW905243 H00087 AW905210
438224 452656.1 AA933999 AA781181
447884 740749.1 H29505 R18575 Z43580 T48738 AI435454 BE004683
451237 863269.1 AW600293 AI767468
455029 1249374.1 AW851258 AW851435 AW851106 AW651421
455060 1251259.1 AW853441 BE145228 BE145218 BE145162 BE145283
455590 1335127.1 BE072259 BE072230 BE007911
458311 543550.1 AF069478 AF069479 AF069480
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TABLE 8B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in table 8. For each predicted exon, we have listed the genomic sequence 
source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
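
A row of this table can be read as four whitespace-separated fields: probeset key, source GI number, strand, and a start-end nucleotide range. The Python sketch below parses such a row; the class and function names are hypothetical, and treating both coordinates as inclusive when computing the exon length is an assumption the text does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedExon:
    pkey: str
    ref_gi: str
    strand: str
    start: int
    end: int

    @property
    def length(self):
        # Assumes both coordinates are inclusive.
        return self.end - self.start + 1


def parse_exon_row(row):
    """Parse 'Pkey Ref Strand start-end' into a PredictedExon.

    Raises ValueError for a malformed row, for example a position
    field that has lost its hyphen.
    """
    pkey, ref_gi, strand, span = row.split()
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    start_text, end_text = span.split("-")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start exceeds end in {span!r}")
    return PredictedExon(pkey, ref_gi, strand, start, end)


exon = parse_exon_row("401963 3126783 Plus 51382-51521")
print(exon.pkey, exon.strand, exon.length)
```

Rejecting rows whose position field cannot be split into two numbers is deliberate: it flags damaged entries for manual review instead of guessing where the range boundary was.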



Pkey Ref Strand Nt_position

401963 3126783 Plus 51382-51521
401974 3126777 Plus 8533045683
403087 8954241 Plus 169511-169795
403375 9255944 Minus 82554-92795
403493 7341425 Plus 157568-159084
403612 8469060 Minus 94723-94859
403649 8705159 Minus 27141-27247
403869 7280046 Minus 3437944583
404113 9588571 Minus 13446-13646
404567 7249169 Minus 101320-101501
405163 9966267 Minus 161171-161299
405227 6731245 Minus 22550-22802
405420 7211837 Minus 13428-13582
405455 7656675 Plus 134112-134671
405678 4078670 Plus 151821-152027
405900 6758795 Minus 71181-71535
406135 9164918 Minus 6548945715
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45 






TABLE 9: 1001 GENES SIGNIFICANTLY UP-REGULATED IN NORMAL PROSTATE 
COMPARED TO PROSTATE CANCER

Table 9 shows 1001 genes significantly up-regulated in prostate cancer compared to normal 
prostate. These were selected from 59680 probesets on the Affymetrix/Eos Hu03 GeneChip 
array such that the ratio of "average" normal prostate to "average" prostate cancer tissues was 
greater than or equal to 8.14. The "average" normal prostate level was set to the mean 
amongst 4 normal prostate tissues. The "average" prostate cancer level was set to the 85th
percentile amongst 73 tumor samples. In order to remove gene-specific background levels of
non-specific hybridization, the 10th percentile value amongst all the tissues was subtracted
from both the numerator and the denominator before the ratio was evaluated. 



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Ratio of prostate cancer to normal prostate

Pkey ExAccn UnigeneID Unigene Title

427906 AAB64330
443685
451554
418323
429480



451002 AA013299 



443576 AI0780Z7 
434247 AA928116 
400452 AK000165 



AI474868 
NM_002118 



418917 
404407 
35 442027 
433704 



40 



AW138330 
XQ2894 



415354 
424239 
444143 
401672 



411972 
448992 
408828 
409653 
402964 
422673 
422568 
438907 
405172 
444897 
458019 
405275 
457815 
424385 
407172 



AA608684 

U83527 

F06495 

M67439 

AW747996 

AW383947 

BE074959 

AI766053 

BE540279 

AW451693 

N59027 

AA372275 

R32704 

AW137088 
AW592931 



Hs2018 ESTs, WeaWy similar to ALU3_HUMAN ALU 8 
Hs.188999 ESTs 
Hs.169338 ESTs 
Hs272065 ESTs 

gb:Homo sapiens cDNA FU20178 fis, clone 

HS.166520 ESTs 
Hs.174481 ESTs 
Hs.183237 ESTs 

Hs.1162 major histocompattoifty complex, class 
HSJ9295 elaslin (supravaivular aortic stenosis, 
H&233778 ESTs 
Hs.1217 adenosine deaminase 

Hs.128395 ESTs 

Hs.121705 ESTs, Moderately similar to ALUC_HUMAN 1 
gb:HSU83527 Human fetal brain (Mloveti) 
gb:HSC1AB051 normalized infant brain cDN 

Hs.143526 dopamine receptor D5 

Hs.160999 ESTs 

H&2463B1 C068an6gen 

gbJP M&eTQ582-M0100-001-t08 BT0582 Homo 
Hs.188346 ESTs 

gb£01059857F1 NlH_MGCJ0 Homo sapiens c 
H&220826 ESTs 

gb.-yv59d1 1 .rl Soares fetal Bver spleen 
Hs279800 Homo sapiens cONA RJ1 1383 fis, clone HE 
H&301298 ESTs 



435672 
420283 
417016 



AA703679 

AA339666 

T54095 

AA424163 

AI700148 

AA485224 

AA837098 

AF074994 



Hs.144857 
Hs256298 
Hs28500 
Hs.106999 



Hs.156895 

H&283626 

Hs27734 

Hs-269933 

H&24240 



ESTs 
ESTs 

mitogarvficthrated protein kinase 8 inter 
ESTs, Weakly similar to S YT5 JflJMAN SYNAP 
gb£ST44776 Fetal brain I Homo sapiens c 
gb:ya92c05.s1 Stratagene placenta (93722 
ESTs 
ESTs 

G protem-coupted receptor tinase-intera 

ESTs 

ESTs 



R1 

1684.00 

73820 

24626 

24520 

221X10 

221.33 

212.00 

163.20 

149X5 

126.11 

12327 

120.00 

106.75 

10571 

10053 

8420 

89.18 

87.73 



86.43 
7726 
6847 
6820 
6126 
57J1 
5640 
54.67 
5420 
54.00 



5226 
5222 
51J63 
5028 
4920 
4820 
4728 
4623 
4357 
4320 
42.70 
4227 
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406134 4243 

457319 AA480895 Hs201552 ESTs, Weakly similar to T17288 hypotheti 42.31 

409314 AA07Q266 gb:zm69d04j1 Stratagene neuroepffliefium 4225 

401124 41J61 

S 428316 A1371157 Hs.178538 ESTs 4O00 

420317 AB006626 Hs58485 K1AA0290 protein 39j64 

457588 AW062439 gb^Rt>CT0060-12089^001-f08 CT0060 Homo 3950 

417407 AA923278 HS28Q905 ESTs, Weakly similar to protease (H^api 38.73 

430269 BE221582 Hs.178364 ESTs 3856 

10 439602 W79114 H&58558 ESTs 36j69 

433688 AA6M799 Hs.136528 ESTs, Moderately similar to ALU1JWMAN A 3629 

417993 AW963705 Hs295806 ESTs, WeaMy sinfe to ALU7JJUMAN ALUS 36.18 

426214 AA936282 Hs. 120397 ESTs 36.10 

416908 AA333990 Hs50424 coagulation factor Xiit, A1 polypeptide 3858 

15 426264 BE314852 Hs.1 68694 hypothetical protein FU1 0257 36.00 

415911 H08798 Hs.1 24952 ESTs 36.00 

457502 AA076049 Hs274415 Homo sapiens cDNA FU1Q229 fis, done HE 3523 

421566 NWL000399 Hs.1395 early growth response 2 (Krox-20 (Droscp 3520 

401468 3489 

20 458561 AI220150 Hs211 195 ESTs 3450 

433601 BE350738 Hs.1 23993 ESTs, WeaWy similar to T00366 hypothefl 3324 

454977 AW848032 gb:M$CT021 4-231 299-053O1 1 CT0214 Homo 3256 

402828 3253 

414522 AW518944 Hs.76325 Homo sapiens cDNA: FU23125 fis, dona L 31.76 

25 402842 3158 

421245 AA285363 Qb;HTH280 HTCOL1 Homo sapiens cDNA 512 3159 

401631 F05183 Hs.1799 COIDanUgen.d polypeptide 3126 

408057 AW139565 gb;UWfBI1-aea-(W4-f>ULs1 NCLCGAP.Su 3124 

408069 H81795 gb:ys68a10Jl Scares retina N2b4HR Homo 3120 

30 438694 T87479 Hs291797 ESTs 3159 

449156 AF103907 Hs.1 71 353 prostate cancer antigen 3 29.78 

428796 AU076734 Hs.193665 sotute earner family 28 (sodium-coupled 29.76 

452549 AI907039 gb:PM-BT134-Q20499-566 BT134 Homo sapien 2959 

410129 BE244074 Ks285531 regulator of Fas-induced apoptosis 2953 

35 414464 AI870175 Hs.13957 ESTs 29.47 

412326 R07566 Hs.73817 Small Inducible cytokine A3 (homologous 2922 

459081 W07808 gb2b03a12Jl Soares_fetaLlung_NbHL19W 2920 

448702 AW102670 Hs.122464 ESTs 29.13 

451939 U80456 Hs27311 single-minded (Drosophlia) homolog 2 28.74 

40 443412 W84893 Hs5305 angiotensin receptor-fike 1 2851 

457324 AB028990 Hs243901 K1AA1067 protein 2824 

424247 X14008 Hs234734 rysozyme (renal amyloidosis) 28.18 

457140 A1279960 Hs.178140 ESTs 28.12 

444151 AW972917 Hs.128749 aipha-methylacy^CoA racemase 2856 

45 457669 A W1 04257 Hs.123426 ESTs, WeaWy similar to putative serine/ 2751 

412429 AV65Q262 Hs.75765 GR02 oncogene 2756 

405495 2753 

406518 2725 

407997 AW135429 Hs243577 ESTs 2656 

50 442115 AW452332 Hs257554 ESTs 2656 

409038 T97490 Hs500Q2 small Inducible cytokine subfamily A (Cy 2654 

402838 2652 

449846 AI979284 Hs200552 ESTs " 2621 

417153 X57010 HS51343 collagen, type II. alpha 1 (primary oste 2620 

55 439792 NM.014858 Hs.6684 WAA0476 gene product 2551 

450098 A1682088 Hs223368 ESTs 2550 

424186 AL133860 Hs.142928 Homo sapiens mRNA; cDNA DKFZjp434M0927 (f 2557 

414246 BE391090 HS280278 EST 2557 

420848 NM.005188 Hs59980 Cas-6r-M (murine) ecotropte retroviral t 2548 

60 424778 AA251048 Hs.1 53042 lymphocyte antigen 9 2542 

409126 AA063426 gb2f70c085l Soaresj)heaLgland_N3HPQ 2525 

443936 AW083491 Hs51196 ESTs 2522 

419392 W28573 gb51f10 Human retina cDNA randomly prim 2551 

411201 T74588 Hs5509 ESTs, Weatfy similar to C03_HUMAN COMPLE 2455 

65 422940 BE077458 gb:RC1-8T060W)90500-015-b04 BT0606 Homo 24.78 

437571 AA760894 Hs.153023 ESTs 24.74 

433973 AI014723 Hs.131770 ESTs 2457 

422416 BB019557 Hs.1 1900 Human DNA sequence from done RP4^B3P15 2453 

421552 AF026692 Hs.1 05700 secreted frbztefkeJaled protein 4 2449 
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443568 U25758 H&134584 ESTs 2449 

424800 AL035588 Hs.153203 MyoD family tnhSte 24.10 

453633 AA357001 Hs.34045 hypothetical protein RJ20764 24X4 

430565 AL122081 K&244343 cadherin related 23 24J0O 

5 433694 AI208611 Hs.12068 Homo sapiens cONA FU1 1720 f2s» dona HE 23X9 

451045 AA215672 gbzr96e09.s1 NCLCGAP_GCB1 Homo sapiens 23j63 

408583 AW449674 Hs.47359 ESTs 23.73 

444040 AF204231 Hs, 182982 golgm-67 23.62 

414182 AA1 36301 gb2k93g04.s1 SoamsjwegnanLutenjs.NbH 2139 

10 418678 NM_001327 Hs.167379 cancertesfc antigen 2320 

408380 AF123050 Hs.44532 diubfajuffin 22J68 

456076 BE243877 Hs.76941 ATPase, NaWK+ transporting, beta 3 poty 22X5 

418299 AA279530 Hs.83968 bitegrin, beta 2 (antigen CD18 (p95), ty 22X8 

444917 RS8651 Hs.144997 ESTs 2226 

15 444381 BE387335 K&283713 ESTs 22XB 

415788 AW628686 Hs.78851 KIAA0217 protein 22X4 

410896 AW809637 gb:MR4-ST0124-261099415-b07 ST0124 Homo 22j00 

412978 AI431708 H&820 homeoboxC6 2U95 

458418 AV653846 Hs. 126261 Homo sapiens Chromosome 1 6 BAC done CfT 21.94 

20 454791 BE071874 gfcRC*BT0522-120200-014-aQ6 BT0522 Homo 21-84 

408748 J05500 Hs.47431 spectrin, beta, erythrocytic (includes s 2126 

416011 H14487 gb:ym1 8c1 Oil Scares Infant brain 1 NIB H 2124 

440474 AI207936 Hs.7195 gamma-aminobutyric acid (GABA) A recepto 21.14

447047 AI623698 Hs.246306 Homo sapiens cDNA: FLJ23529 fis, clone L 21.11

426793 X89887 Hs.172350 HIR (histone cell cycle regulation defec 21.10

409841 AW502139 gb:UI-HF^ROp-fijr-e-05^HJlJl NIH.MGCJ5 21.07

405685 20.90

457359 AI983207 Hs.192481 ESTs, Weakly similar to SYPH_HUMAN SYNAP 2034

423067 AA321355 Hs285401 ESTs 20.74

422355 AW403724 Hs.140 immunoglobulin heavy constant gamma 3 (G 20.73

401201 20.73

458278 W28912 Hs.129019 ESTs 20.68

439097 H66948 gb.*yr88d1 OjI Soares fetal liver spleen 20.67

414875 H42679 Hs.77522 major histocompatibility complex, class 20.66

400926 20.66

451355 NM_004197 Hs.444 serine/threonine kinase 19 20.64

446982 AW500221 Hs.43616 Homo sapiens mRNA for FLJ00029 protein, 20X1

417105 X60992 HsX1226 CD6 antigen 20X1

405777 20X1

424123 AW966158 HsX8582 Homo sapiens cDNA FLJ12702 fis, clone NT 2020

425009 X58288 Hs.154151 protein tyrosine phosphatase, receptor t 20.10

443271 BB88568 Hs.195704 ESTs 19.98

421064 AI245432 Hs.101382 tumor necrosis factor, alpha-induced pro 19.98

418819 AA228776 Hs.191721 ESTs 19.94

457595 AA584854 gbmo09h11.s1 NCI_CGAP_Phe1 Homo sapiens 19.90

404426 19.84

412571 U43143 Hs.74049 fms-related tyrosine kinase 4 19.79

431457 NM_012211 Hs256297 integrin, alpha 11 19.62

414002 NM_006732 Hs.75678 FBJ murine osteosarcoma viral oncogene h 19X7

418994 AA296520 HsX9546 selectin E (endothelial adhesion molecul 19X8

437158 AW090198 Hs.4779 KIAA1150 protein 19X2

437866 AA156781 HsX3992 ESTs 19.44

417421 AL138201 HsX2120 nuclear receptor subfamily 4, group A, m 19X4

433057 X15675 Hs296832 Human pTR7 mRNA for repetitive sequence 1922

421730 AW449808 Hs.164036 glucosamine (N-acetyl)-6-sulfatase (Sanf 1921

456557 AA264477 HSX6618 ESTs 18.77

440806 AI247422 Hs.129966 ESTs 18.76

439845 AL355743 HsX6663 Homo sapiens EST from clone 41214, full 18.65

416155 AI807264 Hs205442 ESTs, Weakly similar to AF117610 1 inner 18X4

437820 AA769062 Hs.16029 ESTs, Weakly similar to alternatively sp 18X2

450923 AW043951 HsX8449 ESTs 18X9

418329 AW247430 HsX4152 cystathionine-beta-synthase 18X8

424537 AI673027 Hs.143271 ESTs 18X5

447742 AF113925 Hs.19405 caspase recruitment domain 4 18X2

415251 R42863 Hs.7124 ESTs 1847

440770 AA912815 HS222078 ESTs 1840

407711 AI085846 Hs.25522 ESTs 18X2

427157 U51166 Hs.173824 thymine-DNA glycosylase 1828

409847 AW501751 HS279733 ESTs 18.15




417240 N57568 
449214
433867 
431735 
401515 
444045 
442754 
426559 
432415 
42788) 
432516 
435259 
414989 



417651 
453457 



AI971131 
AL050102 
AI889114 
AK000596 
AW977724 

AI097439

AL045825 

AB001914 

T16971 

AI188225 

RO80O3 

AA152106 

TB1668 

AW118683

R06874 

AL037103 



419078 
417698 
431117 
455254 
425782 
426678 
426403 
425905 



420940 
459234 
404756 
422247 



443559 
438703 
411424 
402895 
422538 
447108 
448520 
438567 
407811 
410721 
437133 
408182 
417315 
431840 



M93119 

BE241624 

AF003522

AW877015 

U66468 

H08170 

NM_000361

AB032859 

AW451157 

AA830664 

AI940425

U18244 
F09247 
AI076765
AI803373 
AW845985 

NM_006441
AW449602 
AB002367 
AW451955 
AW190902



418277 



420120 
428597 
447033 
421684 



AB018319 

AA047854 

AI080042 

AA534908 

AA847856 

AW135221 

AW796342 

AL049610 

NM_003816

AI357412 

BE281591 

AA055800 

AV656098 

AA076769 



426108 AA622037 
416208 AW291168 
410708 AA534370 
447342 AI199268 
454563 AW807530 
411507 AW850140 
438170 AI916685
416292 AA179233 



446012 
409671 



Hs.176028 EST

Hs.123138 leucine rich repeat and death domain con

Hs.278615 ESTs

Hs.276770 CDW52 antigen (CAMPATH-1 antigen)

Hs.293684 ESTs, Weakly similar to alternatively sp

Hs2272Q9 DKFZP586F1019 protein

Hs.195663 ESTs

Hs.3818 hippocalcin-like 1

Hs.75968 thymosin, beta 4, X chromosome

Hs.135548 ESTs
Hs210197 ESTs

Hs.170414 paired basic amino acid cleaving system
HS289014 ESTs
Hs.127462 ESTs
Hs.188013 ESTs
Hs.4859 cyclin L ania-6a

pjb7d29c04/1 Soares fetal liver spleen
Hs.154150 ESTs
Hs.268528 ESTs

Hs.270599 ESTs, Weakly similar to unnamed protein

Hs.143604 Kaiso

Hs59584 insulinoma-associated 1

Hs.82401 CD69 antigen (p60, early T-cell activati

Hs.250500 delta (Drosophila)-like 1

gb^V2-PT0010-25030(W)9W12 PT0010 Homo
Hs.159525 cell growth regulatory with EF-hand doma
Hs.113755 ESTs
Hs2030 thrombomodulin
Hs.161700 KIAA1133 protein
Hs.181157 ESTs
Hs.143974 ESTs

O^:CM0^T(X)52-150799^4-c04 CT0052 Homo

Hs.113602 solute carrier family 1 (high affinity a
Hs.167399 protocadherin alpha 5
Hs269899 ESTs
Hs.31599 ESTs

gbflC2-CTO163-20O99^OQ2-H08 CT0163 Homo

Hs.118131 5,10-methenyltetrahydrofolate synthetase
Hs217953 ESTs, Moderately similar to NK-TUMOR REC
Hs.21355 doublecortin and CaM kinase-like 1
Hs.153065 ESTs

Hs.40098 cysteine knot superfamily 1, BMP antagon
Hs.2730 heterogeneous nuclear ribonucleoprotein
Hs.5460 KIAA0776 protein

gb:zf49g04 jl Soares retina N2b4HR Homo
Hs.180450 ribosomal protein S24
Hs.2880 POU domain, class 5, transcription facto
Hs.124565 ESTs
Hs.130312 ESTs

gb:P M2-UM0027'23020CK)02-h02 UM0027 Homo
Hs.95243 transcription elongation factor A (SII)
Hs2442 a disintegrin and metalloproteinase doma
Hs.157601 EST - not in UniGene
Hs.106768 hypothetical protein FLJ10511
Hs.222933 ESTs

Hs.172382 hypothetical protein FLJ20001

gb:7BQ2B10 Chromosome 7 Fetal Brain cDNA

Hs.166468 programmed cell death 5
Hs.41295 ESTs

Hs.154088 Homo sapiens cDNA: FLJ22756 fis, clone K
Hs.19322 ESTs; Weakly similar to !!!! ALU SUBFAMI

gbKMVST0081-130999^WQ2 ST0081 Homo
gb:tl^CTQ219-261099O234)11 CT0219 Homo
Hs.194601 ESTs

nasopharyngeal carcinoma susceptibility
18.13
18.12 
18.12 
17.90 
17^2 
1752 
17.75 
17.72 
17.71 
17.67 
1758 
1755 
1754 
1750 
1750 
1744 
1756 
1751 
1750 
1727 
1722 
1722 
17.18 
17.14 
17.14 
17.14 
17.12 
17.12 
17J01 
17J0O 
1658 
1654 
1652 
1651 
1650 
1658 
1650 
1678 
16.70 
1659 
1658 
1655 
1654 
1652 
1650 
1650 
16.40 
1852 
1650 
1628 
1620 
1659 

• 1654 
1654 
1652 
1652 
1554 
. 1553 
1558 
1555 

. 1554 
1554 
1548 
1542 
1558 
1557 
1556 
1529 
1526 
406638


M13861 


446686 


AW138043 


434485 


AI623511 


441188 


AW292S30 


A aa 


DdAT7Af\ 

DC14/74U 


409521 


BE244654 


420748 


AA279956 


422583 


AA410506 


424240 


ABQ23185 


451118 


AI862098 


437495 


BE177778 


445467 


AI239832


418305 


AWD06783 


402812 




436851 


AA732480 


400991 




415752 


BE314524 


429800 


AA460421 


403683 
430315 


NM_004293


451952 


AL120173 


424687 


J05070 


447229 


BE517135 


425818 


ABQ2122S 


448553 


AI638449


431089 


BE041395 


459145 


AI903354 


449650 


AF055575 


400952 




445885 


AI734009 


407938 


AA905097 


431676 


AI685464


437210 


AA311443 


451900 


AB023199 


445800 


AA126419 


412368 


AW945992 


409055 


AW304Q28 


408763 


W57550 


446734 


AL04S278 


413551 


BE242639 


421913 


AJ934365 


452712 


AWB38616 


451468 


AW503398 


406038 


Y14443 


424909 


S78187 


434078 


AW880709 


415254 


AIB15831 


418196 


AI745649 


410020 


TB6315 


411352 


NM_002890


429848 


AF145439 


413729 


BE159999 


400125 




420319 


AW406289 


448272 


AI479094


422695 


AA315158 


424565 


AW102723 


458048 


H30340 


408894 


AI935400


454093 


AWB60158 


410889 


X91662 


457751 


AI908238 


455131 


AW857913 


408364 


AW015238 


425907 


AA365752 


402359 




401044 




409877 


AW502498 


423690 


AA329648 



gb:Human T-cell receptor active beta-cha
Hs.156307 ESTs
Hs.118567 ESTs
Hs255609 ESTs
Hs.104558 ESTs

Hs.159578 Homo sapiens mRNA for FLJ00020 protein,
Hs.88672 ESTs

Hs.118578 H.sapiens mRNA for ribosomal protein L18
Hs.143535 calcium/calmodulin-dependent protein kin
Hs.60640 ESTs

gb:RC1-HT0598-31030O012-f07 HT0598 Homo
Hs.15617 ESTs, Weakly similar to ALU4_HUMAN ALU S
HsJSm ESTs

Hs.283581 ESTs

Hs.78776 Human putative transmembrane protein (nm
Hs.30875 ESTs

Hs.239147 guanine deaminase
Hs.201663 ESTs

Hs.151738 matrix metalloproteinase 9 (gelatinase B

gb:601441677F1 NIH_MGC_65 Homo sapiens c
Hs.159581 matrix metalloproteinase 17 (membrane-in
Hs.173031 ESTs

Hs283676 ESTs, Weakly similar to unknown protein

gb:ROBT029-1 00199-1 17 BT029 Homo sapien
Hs.297647 ESTs, Moderately similar to calcium chan



Hs.127699

Hs.85050

Hs.292638

Hs.293563

Hs27207

Hs.301632

Hs.181125

Hs200578

Hs.301526

Hs.16074

Hs.75425

Hs.109439

Hs.210047

Hs.88219

Hs.153752

Hs283683

Hs.184378

Hs.26549

Hs.728

Ha758

Hs.225946



Hs26593
Hs.170786

Hs.75295

Hs.173705

Hs217286

Hs.66744



Hs.128453
Hs.155965



EST cluster (not in UniGene)



ESTs

Homo sapiens mRNA; cDNA DKFZp586E2317 (f

KIAA0982 protein

ESTs

immunoglobulin lambda locus
ESTs

Homo sapiens cDNA FLJ13181 fis, clone NT
Homo sapiens mRNA; cDNA DKFZp564l153 (fr
ubiquitin associated protein
osteoglycin (osteoinductive factor, mime
gb«C5-LT0054-1402(KH)13^01 LT0054 Homo
ESTs

zinc finger protein 200
cell division cycle 25B
EST
ESTs

ESTs, Weakly similar to T00066 hypotheti
ribonuclease, RNase A family, 2 (liver,
RAS p21 protein activator (GTPase activa
chemokine (C-C motif) receptors
gbX}V1-HTO412-270300-123KJ10 HT0412 Homo

hypothetical protein
ESTs

gb:EST186958 HCC cell line (metastasis t
guanylate cyclase 1, soluble, alpha 3
Homo sapiens cDNA: FLJ22050 fis, clone H
ESTs

gb:RC0<n , Q37&«2901QW>32-b04 CT0379 Homo
twist (Drosophila) homolog (acrocephalos
gb:IL-BT166-180399-010 BT166 Homo sapien
gb:RC0-CT0323-231199-031-b05 CT0323 Homo
ESTs
ESTs



15.26 
1525 
1524 
1522 
1522 
15.16 
15.14 
15.14 
15.12 
15.12 
15.12 
15.06 
15.03 

^5D2 

15J00 
15D0 
1456 
1450 
1424 
1420 
14.72 
1429 
1427 
1425 
1423 
1420 
1425 
1424 



Hs.157150 ESTs, Weakly similar to zinc finger prot
HS23804 ESTs 



14.44 
1442 
1440 
1426 
1426 
1422 
1421 
1423 
1422 
1422 
1422 
1422 
1422 
14.16 
14.14 
1427 
1427 
1425 
1422 
1328 
1328 
1325 
1320 
1328 
1325 
1320 
1320 
13.78 
ia78 
13.76 
13.75 
13.74 
13.72 
1329 
1327 
1322 
1320 
1323 
1323 
1349 
430685 AI690234 Hs.191666 ESTs, Weakly similar to reverse transcri 1347

414052 AW578849 Hs.283552 ESTs, Weakly similar to unnamed protein 1346

447858 AW080339 Hs.211911 ESTs 1344

435716 AI573283 Hs38458 ESTs 1344

439120 H56389 gb^t37c03Jl Soares_pineal_gland_N3HPG 1343

402788 1340

451591 AA886446 Hs.146278 ESTs 1340

405411 1338

426558 AW188574 Hs24218 ESTs 1334

453506 AA132818 Hs.110407 ESTs, Weakly similar to coded for by C. 1333

416445 AL043004 Hs300678 Human serine/threonine kinase mRNA, part 13.32

457084 AI074149 Hs.150905 ESTs, Weakly similar to chondroitin 4-su 1332

403838 1332

427337 Z46223 Hs.176863 Fc fragment of IgG, low affinity IIIb, r 1330

434318 AW207552 Hs.116328 ESTs, Weakly similar to dJ134E15.1 [H.sa 1338

435193 N41359 Hs.218107 ESTs 1338

414756 AW451101 Hs.159489 ESTs, Moderately similar to hexokinase I 1337

420626 AF043722 Hs.99491 RAS guanyl releasing protein 2 (calcium 1336

420052 AA418850 Hs44410 ESTs 1335

414020 NM_002984 Hs.75703 small inducible cytokine A4 (homologous 1335

403851 1334

422647 W07492 Hs.157101 ESTs 1331

433598 AI762836 Hs371433 ESTs, Moderately similar to ALU2_HUMAN A 1331

409065 AB033113 Hs£0187 KIAA1287 protein 1330

435063 R21868 Hs.57734 G protein-coupled receptor kinase-intera 13.19

439367 BE386844 Hs348746 ESTs 13.17

451957 AI796320 Hs.10299 Homo sapiens cDNA FLJ13545 fis, clone PL 13.16

420569 AA278362 Hs389062 Homo sapiens cDNA FLJ12334 fis, clone MA 13.14

447883 BE262802 Hs4909 dickkopf (Xenopus laevis) homolog 3 13.07

426490 NM_001621 Hs.170087 aryl hydrocarbon receptor 13.06

414789 AA155859 Hs.79708 ESTs 13.05

451416 BE387790 HS36369 ESTs 13.04

443494 T99719 Hs370404 Homo sapiens cDNA: FLJ22389 fis, clone H 13.03

425878 AW964806 Hs38085 ESTs, Weakly similar to putative glycine 13.02

431912 AI660552 Hs.154903 ESTs, Weakly similar to A56154 Abl subst 1100

407122 H20276 Hs.31742 ESTs 13.00

456491 AL137466 Hs37277 Homo sapiens mRNA; cDNA DKFZp434H1322 (f 1239

448172 N75276 Hs.135904 ESTs 1238

452144 AA032197 Hs.102558 ESTs 1236

419953 BE267154 Hs.125752 ESTs 1236

416182 NM_004354 Hs.76069 cyclin G2 1234

451154 AA015879 Hs33536 ESTs 1233

412257 AW903830 gb:CM4-NN1037-250400-155-h04 NN1037 Homo 1233

449784 AW161319 Hs.12915 ESTs 1232

432695 D63480 Hs378634 KIAA0146 protein 1232

454105 NM_001259 Hs38481 cyclin-dependent kinase 6 1232

439093 AA534163 Hs.5476 serine protease inhibitor, Kazal type, 5 1230

416098 H41324 Hs31561 ESTs, Moderately similar to ST1B_HUMAN S 1238

424897 D63216 Hs.153684 frizzled-related protein 1238

414604 AU076649 Hs.76556 growth arrest and DNA-damage-inducible 3 1238

414664 AA587775 Hs36295 Homo sapiens HSPC311 mRNA, partial cds 1234

452560 BE077084 gb:RC5-BT0603-22a200K)13C07 BT0603 Homo 1234

413859 NM_000878 Hs.75598 interleukin 2 receptor, beta 1230

452359 BE167229 Hs39206 Homo sapiens clone 24659 mRNA sequence 1230

435886 BE265639 Hs.12126 hepatocellular carcinoma-associated anti 12.78

445230 U97018 Hs.12451 echinoderm microtubule-associated protei 12.78

412226 W26786 gb:15d7 Human retina cDNA randomly prime 12.77

446619 AU076643 Hs.313 secreted phosphoprotein 1 (osteopontin, 12.76

447769 AW873704 Hs.48764 ESTs 12.76

414478 AI306389 Hs.76240 adenylate kinase 1 12.76

425383 D83407 Hs.156007 Down syndrome critical region gene 1-lik 1238

450704 H85157 Hs.40696 ESTs 1236

405856 1236

412935 BE267045 Hs.75064 tubulin-specific chaperone c 1235

402802 1232

452588 AA889120 Hs.110637 homeo box A10 1232

419978 NM_001454 Hs.93974 forkhead box J1 1232

403137 1230

430226 BE245562 Hs3551 adrenergic, beta-2-, receptor, surface 1237

448076 AJ133123 Hs20196 adenylate cyclase 9 1256

450462 F07097 Hs.300828 Homo sapiens mRNA full length insert cDN 1254

405236 1252

409292 AA071051 gb:zm58e05.s1 Stratagene fibroblast (937 1247

421540 AA767689 Hs.10242 ESTs 1247

425840 AW978731 Hs.301824 ESTs 1244

443181 AI039201 Hs34548 ESTs 1242

452436 BB077546 HS31447 ESTs 1242

455183 AW984111 gb:RCO-HN0007-1 6030OO1 1-f09 HN0007 Homo 1240

432887 AI926047 Hs.162859 ESTs 1237

410494 M36564 Hs.64016 protein S (alpha) 12.38

439024 R96696 Hs.35598 ESTs 1236

451246 AW189232 Hs39140 cutaneous T-cell lymphoma tumor antigen 1236

432692 AL042615 Hs.15995 ESTs 1235

418982 AI348838 Hs.13073 ESTs 1235

414516 AI307802 Hs.279551 ESTs 1234

440134 BE410734 gb:601301618F1 NIH_MGC_21 Homo sapiens c 1229

443873 AL048542 Hs.16291 ESTs 12.28

401288 1226

454020 AW962845 Hs256527 ESTs 1224

420077 AW512260 Hs37767 ESTs 1224

443837 AI984625 Hs.9884 spindle pole body protein 1224

407519 X64979 gb:H.sapiens mRNA HTPCRX01 for olfactory 1223

435839 AF249744 Hs25951 Rho guanine nucleotide exchange factor ( 1222

448552 AW973653 Hs20104 hypothetical protein FLJ00052 1220

405325 1220

451009 AA013140 Hs.115707 ESTs 12.18

423066 Y18264 Hs.120171 ESTs 12.17

439556 AI623752 Hs.163603 ESTs 12.16

443062 N77899 Hs3963 Homo sapiens mRNA full length insert cDN 12.15

445873 AA250970 Hs251946 Homo sapiens cDNA: FLJ23107 fis, clone L 12.14

453542 AW836724 Hs33190 Homo sapiens mRNA expressed only in plac 12.11

440106 AA864968 Hs.127699 ESTs 12.10

417605 AF006809 Hs.82294 regulator of G-protein signalling 3 12.10

440288 U29589 Hs.7138 cholinergic receptor, muscarinic 3 12.04

420061 AW024937 Hs29410 ESTs 1232

458727 AI022813 Hs32679 Homo sapiens clone CDABP0014 mRNA sequen 1136

445407 AI222658 Hs221889 ESTs, Weakly similar to fa costa [D.mela 1155

416250 U29926 Hs33918 adenosine monophosphate deaminase (isofo 11.94

414129 AI890287 Hs270798 ESTs 11.93

409799 D11928 Hs.76845 phosphoserine phosphatase-like 11.92

438461 AW075485 HS288049 phosphoserine aminotransferase 1122

443912 R37257 Hs.184780 ESTs 1122

424606 AA343936 gb:EST49786 Gall bladder I Homo sapiens 11.90

434217 AW014795 Hs23349 ESTs 1120

451533 NM_004657 Hs26530 serum deprivation response (phosphatidyl 1130

422423 AF283777 Hs.116481 CD72 antigen 1139

409398 AW386461 gb:PMm0019-121299^X)4-F02 PT0019 Homo 1129

423853 AB011537 Hs.133466 slit (Drosophila) homolog 1 1132

446180 AI074413 Hs.14220 hypothetical protein FLJ20450 1120

414341 D80004 Hs.75909 KIAA0182 protein 1130

406538 11.79

433253 AW450502 Hs24218 ESTs 11.79

447397 BE247676 Hs.18442 E-1 enzyme 11.78

451684 AF216751 Hs26813 CDA14 11.76

416862 R23765 Hs23575 ESTs 11.74

425770 NM_014363 Hs.159492 spastic ataxia of Charlevoix-Saguenay (s 11.72

428828 AL046842 Hs.194019 attractin 11.72

433037 NM_014156 Hs279938 HSPC067 protein 11.72

447476 BE293466 Hs20880 ESTs 11.72

452092 BE245374 Hs27842 hypothetical protein FLJ11210 11.72

412922 M60721 Hs.74870 H2.0 (Drosophila)-like homeo box 1 11.72

401660 NM_005578 Hs.180398 LIM domain-containing preferred transloc 1139

422576 BE548555 Hs.118554 CGI-83 protein 1138

450203 AF097994 Hs301528 L-kynurenine/alpha-aminoadipate aminotra 1138

410531 AW752953 gtK3VOCT0224-261C39^g02 CT0224 Homo 1137

425917 W28517 Hs.117167 Homo sapiens cDNA: FLJ23067 fis, clone L 1136

418693 AI750878 Hs37409 thrombospondin 1 1134

400557 1132

416188 BE157280 Hs.79070 v-myc avian myelocytomatosis viral oncog 11.60

419047 AW952771 Hs30043 ESTs 11.59

420441 AI986160 HS38448 ESTs 1159

400885 11.57

409853 AW502327 gbUWF-BROp-aka-a-07-OUl.rl NIH_MGC_5 1156

400802 1156

434540 NM_016045 Hs5184 TH1 drosophila homolog 1155

431449 M55994 Hs.256278 tumor necrosis factor receptor superfami 1155

425928 S55736 Hs.238852 ESTs, Weakly similar to hypothetical pro 1154

434701 AA460479 Hs.4096 KIAA0742 protein 1153

434228 Z42047 Hs.283978 ESTs; KIAA0738 gene product 1152

420729 AW964897 Hs.290825 ESTs 1152

428328 AA426080 Hs.98489 ESTs 1150

433887 AW204232 Ha279522 ESTs 1150

414812 X72755 Hs.77367 monokine induced by gamma interferon 11.46

457718 F18572 Hs3297B ESTs 11.44

452260 AA453208 Hs38726 RAB9, member RAS oncogene family 11.42

459029 AA131376 Hs3852Q3 fibroblast growth factor 12 11.42

456267 AI127958 Hs.83393 cystatin E/M 1139

433285 AW975944 Hs.237396 ESTs 1138

449186 AW291876 Hs.196988 ESTs 1137

447861 AI434593 Hs.164294 ESTs 1137

456023 R00028 gb:ye70a06.s1 Soares fetal liver spleen 1136

439444 AI277652 Hs54578 ESTs 11.31

401183 1131

430886 L36149 Hs3481 16 chemokine (C motif) XC receptor 1 1128

450784 AW246803 Hs.47289 ESTs 1138

452391 AL044829 Hs.29331 carnitine palmitoyltransferase I, muscle 1137

449625 NM_014253 Hs33796 odz (odd Oz/ten-m, Drosophila) homolog 1 1136

456827 AA075687 Hs.147176 epidermal growth factor receptor substra 1134

439328 W07411 Hs.118212 ESTs, Moderately similar to ALU3_HUMAN A 1134

432093 H28383 gb:yt52c03.r1 Soares breast 3NbHBst Homo 1134

407335 AA631047 Hs.158761 Homo sapiens cDNA FLJ13054 fis, clone NT 1133

442501 AA315267 Hs.33128 ESTs 1132

429746 AJ237672 Hs314142 5,10-methylenetetrahydrofolate reductase 1131

422858 R35398 gb:yg84g10.r1 Soares infant brain 1NIB 1130

415156 X84808 Hs.78060 phosphorylase kinase, beta 1130

446713 AV660122 Hs382675 ESTs 1130

452221 C21322 Hs.11577 ESTs 1130

418261 W78902 Hs393297 ESTs 11.17

433332 AI367347 Hs.127809 ESTs 11.16

434539 AW748078 Hs314410 ESTs 11.16

413471 BE142098 gb£M44fT0137-220999^)17-<I11 HT0137 Homo 11.14

410037 AB020725 Hs58009 KIAA0918 protein 11.14

405601 11.13

458332 AI000341 Hs320491 ESTs 11.12

427654 AA410183 Hs.137475 ESTs 11.12

427138 N77624 Hs.173717 phosphatidic acid phosphatase type 2B 11.10

431475 AI567669 Hs387316 ESTs 11.10

425710 AF030880 Hs.159275 solute carrier family, member 4 11.08

413748 AW104057 Hs.19193 ESTs 11.07

409208 Y00093 Hs51077 integrin, alpha X (antigen CD11C (p150), 1137

457278 W92745 Hs.193324 ESTs 11.03

407031 U52077 gb:Human mariner1 transposase gene, comp 11.02

445701 AF055581 Hs.13131 lymphocyte adaptor protein 11.02

408338 AW867079 gb:MR1^NQO33-12O4O(K)02-c10 SN0033 Homo 1035

401030 BE382701 Hs35960 v-myc avian myelocytomatosis viral relat 1035

437891 AW006969 Hs.6311 hypothetical protein FLJ20859 1034

453874 AW591783 Hs36131 collagen, type XIV, alpha 1 (undulin) 1034

421562 AA530994 Hs.105803 ghrelin precursor 1032

413431 AW246428 Hs.75355 ubiquitin-conjugating enzyme E2N (homolo 1032

400132 1032

436420 AA443968 Hs31595 ESTs 1030

424880 NM_000328 Hs.153614 retinitis pigmentosa GTPase regulator 1058

433264 D85782 Hs3229 cysteine dioxygenase, type I 1038

429842 AT366213 Hs.173422 KIAA1605 protein 1037

412405 AW948128 gbflC0*rVnT0013-28030(H)31^12MTrxi3rfemo 1035

400615 1030

425018 BE245277 Hs.154196 E4F transcription factor 1 1030

BE243628

BE176862 

BE218418 

AW803564 

AW377314 

AI383497 

R40978 

AA694070 

NM_006558

U24578 

AW863261 

AA220238 

AF151879 

AP075031 

AW080953 



400880 
415875 
434715 
406851 
413409 
418489 
419465 
419544 
432180 
413822 
437446 
415701 
443790 
458873 
415082 
429124 
417187 



AA715328 

AA128423 

D50918 

R10184 

AI142350 

AA178955 

AW248508 

AK001332 
AF059214 

AA894876 

BE005346 

AA609784 

AI638418 

U76421 

AW500239 

AI909154 

Y18418 



AA788946 

NM_003878

NM_003500 

AW150717 

AA160000 

AW505086 

AB011151 

AW067605 

NM_000030



423445 NMJM4324 
409995 AW9S0597 
432242 AW022715 
50 406394 AA172106 



406189 
422263 
401598 



416511 
427274 
401384 



418400 
437401 
403690 
423790 
434094 
434967 
432827 



AW411307 

AA172106 

T89832 

NM_006762 

NM_005211

D13168 

AF037062 

AI684746 

AJ364997 

BE243Q28 

AA757196 

BE152393 
AA305599 
AW975009 
Z68128 



gb:TCBAP1D1053 Pediatric pre-B cell acut
gbflOWT0587-17030(KJ12-a04 HT0587 Homo

Hs201802 ESTs

Hs288850 ESTs

Hs.5364 DKFZP564I052 protein

Hs.131811 ESTs, Weakly similar to ALU1_HUMAN ALU S

Hs.271498 ESTs, Moderately similar to ALU1_HUMAN A

Hs.268835 ESTs

Hs.13565 Sam68-like phosphotyrosine protein, T-ST

Hs.170250 complement component 4A

Hs.15036 ESTs, Highly similar to AF161358 1 HSPC0

Hs34988 ribonuclease P (38kD)

Hs.26706 CGI-21 protein

Hs.293Z7 ESTs

gbxc28c12j(1 NCI_CGAP_Co18 Homo sapiens
Hs.171096 Homo sapiens EST from clone DKFZp434A041
Hs291205 ESTs
Hs.40300 calpain 3, (p94)
Hs50998 KIAA0128 protein; septin 2
Hs.191987 ESTs, Weakly similar to ALU1_HUMAN ALU S
Hs.146735 EST
HS271439 ESTs
Hs279727 ESTs;

Hs44672 hypothetical protein FLJ10470
Hs.194687 cholesterol 25-hydroxylase

Hs5687 protein phosphatase 1B (formerly 2C), ma
Hs.116410 ESTs

Hs.180255 major histocompatibility complex, class
Hs.21745 ESTs

Hs55302 adenosine deaminase, RNA-specific, B1 (h
Hs.21187 Homo sapiens cDNA: FLJ23068 fis, clone L

gb«V-BT20£H)10499-007 BT200 Homo sapien
Hs.272822 RuvB (E coli homolog)-like 1
Hs272044 ESTs, Weakly similar to ALU1_HUMAN ALU S
Hs.16869 ESTs, Moderately similar to CA1C_RAT COL
Hs.78619 gamma-glutamyl hydrolase (conjugase, fol
Hs^795 acyl-Coenzyme A oxidase 2, branched chai
Hs296176 STAT induced STAT inhibitor 3
Hs.137398 ESTs

Hs.196914 minor histocompatibility antigen HA-1
Hs51505 KIAA0579 protein
Hs.172665 methylenetetrahydrofolate dehydrogenase
Hs.271366 alanine-glyoxylate aminotransferase homo
Hs.17126 ESTs

Hs.128749 alpha-methylacyl-CoA racemase
Hs50164 ESTs

Hs.162160 ESTs, Weakly similar to ALU4_HUMAN ALU S
Hs.110950 Rag C protein

Hs.114311 CDC45 (cell division cycle 45, S.cerevis

Hs.110950 Rag C protein

Hs.170278 ESTs

Hs.79356 Lysosomal-associated multispanning membr

Hs.174142 colony stimulating factor 1 receptor, fo

Hs52002 endothelin receptor type B
Hs.172914 retinol dehydrogenase 5 (11-cis and 9-cis
Hs.119274 ESTs
Hs.7572 ESTs
Hs501989 KIAA0246 protein
Hs.121190 ESTs

gb:CM2-HT0323-171199-033-a08 HT0323 Homo
Hs.238205 hypothetical protein PRO2013
Hs292274 ESTs

Hs5109 Rho GTPase activating protein 4
Hs54004 ESTs
452234 AW084176 Hs.223296 ESTs 10.14

445629 AI245701 gb:qk31K)5.x1 NCI_CGAP_Kid3 Homo sapiens 10.13

457236 AA626142 Hs.179991 ESTs, Weakly similar to KPCE_HUMAN PROTE 10.13

444605 M174603 Hs.254105 enolase 1, (alpha) 10.12

450313 AI038989 Hs.24809 hypothetical protein FLJ10826 10.12

407482 NM_006056 10.12

449971 AA807346 Hs.288581 Homo sapiens cDNA FLJ14296 fis, clone PL 10.11

441201 AW118822 Hs.128757 ESTs 10.10

435157 AW014605 Hs.179872 ESTs 10.10

417308 H60720 Hs.81892 KIAA0101 gene product 10.09

442582 AI204266 Hs.179303 ESTs 10.05

437252 AI433833 Hs.164159 ESTs, Weakly similar to ALU1_HUMAN ALU S 10.04

448663 BE614599 Hs.106823 H.sapiens gene from PAC 426I6, similar t 10.04

434467 BE552368 Hs.231853 Homo sapiens cDNA FLJ13445 fis, clone PL 1034

423698 AA329796 Hs.1098 DKFZp434J1813 protein 10.02

412707 AW206373 Hs.16443 Homo sapiens cDNA: FLJ21721 fis, clone C 10.00

414658 X58528 Hs.76781 ATP-binding cassette, sub-family D (ALD) 10.00

421832 NM_016098 Hs.108725 HSPC040 protein 10.00

423554 M90516 Hs.1674 glutamine-fructose-6-phosphate transamin 10.00

452039 AI922988 Hs.172510 ESTs 10.00

434673 AW137442 Hs.136965 ESTs 10.00

427978 AA418280 Hs.180040 Homo sapiens cDNA: FLJ22439 fis, clone H 10.00

457803 BE501815 Hs.198011 ESTs 9.99

428279 AA425310 Hs.155766 ESTs 9.98

444412 AI147652 Hs.216381 Homo sapiens clone HH409 unknown mRNA 9.98

417049 N72394 Hs.44862 ESTs 9.96

427509 M62505 Hs.2161 complement component 5 receptor 1 (C5a l 9.96

445424 AB028945 Hs.12698 cortactin SH3 domain-binding protein 9.96

443678 AW009605 HS231923 ESTs 9.96

447567 AW474513 Hs.224397 ESTs, Weakly similar to B48013 proline-r 9.94

414709 AA704703 Hs.77031 Sp2 transcription factor 9.94

434598 T59538 gb.7b65g12^1 Stratagene ovary (937217) 9.94

427630 BE276115 Hs.144980 ESTs, Weakly similar to CA13_HUMAN COLLA 9.93

416111 AA033813 Hs.79018 chromatin assembly factor 1, subunit A ( 9.92

423349 AF010258 Hs.127428 homeo box A9 9.92

424308 AW975531 Hs.154443 minichromosome maintenance deficient (S. 9.92

416814 AW192307 HS30042 doIichyi-P^/anSGIcr^^PKloridiyJgl 9.90

417886 AA481003 HSJ97128 ESTs 9.90

425174 D87450 Hs.154978 KIAA0261 protein 9.90

438171 AW976507 HS293515 ESTs 9.90

421884 AW972187 Hs.110443 hypothetical protein FLJ22215 9.89

408597 NM_005291 Hs.46453 G protein-coupled receptor 17 9.88

413907 AI097570 Hs.71222 ESTs 9.87

451298 AW801383 Hs.118578 H.sapiens mRNA for ribosomal protein L18 9.86

433409 AI278802 Hs.25661 ESTs 9.85

450360 AW117416 Hs.245484 ESTs 9.85

433104 AL043002 Hs.128246 ESTs, Moderately similar to unnamed prot 9.84

449824 AI962552 Hs.226765 ESTs 9.84

452744 AI267652 Hs3Q504 Homo sapiens mRNA; cDNA DKFZp434E082 (fr 9.82

431066 AF026273 Hs249175 interleukin-1 receptor-associated kinase 9.82

426457 AW894667 Hs.169965 chimerin (chimaerin) 1 9.80

443371 AI792888 Hs.145489 ESTs 9.80

437159 AL050072 gb:Homo sapiens mRNA; cDNA DKFZp566E1346 9.75

425242 D13635 Hs.155287 KIAA0010 gene product 9.74

447498 N67619 Hs.43687 ESTs 9.74

426759 AI590401 Hs.21213 ESTs 9.73

435129 AI381659 HS267086 ESTs 9.72

437672 AW748265 Hs.5741 flavohemoprotein b5+b5R 9.72

438209 AL120659 Hs.6111 KIAA0307 gene product 9.72

438440 AA807228 Hs225161 ESTs 9.72

449720 AA311152 HS288708 ESTs; Weakly similar to KIAA0226 [H.sapi 9.72

414291 M289619 Hs.13040 ESTs 9.72

436208 AK001451 Hs.265561 CD2-associated protein 9.70

446896 T15767 HS22452 Homo sapiens cDNA: FLJ21084 fis, clone C 9.70

412667 AW977540 HSJ269254 ESTs 9.70

423301 S67580 Hs.1645 cytochrome P450, subfamily IVA, polypept 9.67

440757 AW118545 Hs.160004 ESTs 9.67

441412 AI393657 Hs.159750 ESTs 9.66

421044 AP061871 Hs.101302 collagen, type XII, alpha 1 9.66

414726 BE466863 Hs.280099 ESTs 9.68

418485 R91679 Hs.124981 ESTs 9.66

433480 X02422 Hs.181125 immunoglobulin lambda locus 9.65

441530 AE48301 Hs.127112 ESTs 9.65

433533 D53304 Hs.65394 ESTs 9.65

421470 R27498 Hs.1378 annexin A3 9.64

438613 C05569 Hs.243122 hypothetical protein FLJ13057 similar to 9.64

429324 AA468101 Hs.199245 Activation escape 1 9.62

450244 AA007534 Hs.125062 ESTs 9.62

407660 AW063190 Hs279101 ESTs 9.61

406554 9.60

426404 AA377607 Hs.273138 ESTs 9.58

447045 AW392394 Hs.278569 KIAA0054 gene product 9.58

449894 AK001578 Hs.24129 hypothetical protein FLJ10716 9.58

448376 AI494332 Hs.196963 ESTs 9.58

407902 AL117474 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (fr 9.56

446572 AV659151 Hs.262961 ESTs 9.56

459245 BE242623 HsX1939 manic fringe (Drosophila) homolog 9.55

423545 AP000692 Hs.129781 chromosome 21 open reading frame 5 9.54

414697 BE266134 Hs.76927 translocase of outer mitochondrial membr 9.54

410846 AW807057 gb:MR4-ST0062-O31199418-bO3 ST0062 Homo 9.52

421181 NM_005574 Hs.184585 LIM domain only 2 (rhombotin-like 1) 9.52

427308 D26067 Hs.174905 KIAA0033 protein 9.52

415995 NM_004573 HsMA phospholipase C, beta 2 9.51

434846 AW295389 Hs.119768 ESTs 9.51

414342 AA742161 Hs.75912 Homo sapiens cDNA: FLJ22199 fis, clone H 9.50

416959 D28459 HsX0612 ubiquitin-conjugating enzyme E2A (RAD6 h 9.50

443123 AA094538 Hs.6588 ESTs 9.50

439312 AA833902 Hs270745 ESTs 9.48

449375 R07114 Hs.271224 ESTs 9.48

436357 AJ132085 gb:Homo sapiens mRNA for axonemal dynein 9.44

458723 AW137726 Hs244352 ESTs, Moderately similar to laminin alph 9.44

457526 AW450584 Hs.192131 ESTs, Weakly similar to RIBB [H.sapiens] 9.43

404741 9.43

42409 NM_005428 Hs.116237 vav 1 oncogene 9.43

403708 9.42

408806 AW847814 Hs289005 Homo sapiens cDNA: FLJ21532 fis, clone C 9.42

417380 T06809 gb:EST04698 Fetal brain, Stratagene (cat 9.42

422501 AA354690 Hs.144967 ESTs 9.42

426197 AA004410 Hs.167835 acyl-Coenzyme A oxidase 1, palmitoyl 9.42

452624 AU076606 HsX0Q54 coagulation factor V (proaccelerin, labi 9.42

412110 AW893569 ^flCCM^O021^4O4O(W21-c1OKN0021 Homo 9.41

414158 AA361623 Hs.268775 Homo sapiens cDNA FLJ13900 fis, clone TH 9.41

408101 AW968504 Hs.123073 CDC2-related protein kinase 7 9.40

414171 AA360328 HsX65 RAP1A, member of RAS oncogene family 9.40

415847 U04045 Hs.78934 mutS (E. coli) homolog 2 (colon cancer, 9.40

426959 BE262745 gb:601153869F1 NIH_MGCJ9 Homo sapiens c 9.39

417519 AI689987 Hs.177669 ESTs, Weakly similar to RMS1_HUMAN REGUL 9.39

457181 BE514362 Hs^96422 FK506-binding protein 3 (25kD) 9.39

402835 9.38

404632 9.38

446566 H95741 Hs.17914 Homo sapiens cDNA: FLJ22801 fis, clone K 9.37

455369 AW903533 gb^14IN1031-06040f>178-d05 NN1031 Homo 9.37

444001 AI095087 Hs.152299 ESTs, Moderately similar to ALU5_HUMAN A 9.36

458191 AI420611 Hs.127832 ESTs 9.36

431374 BE258532 Hs251871 CTP synthase 9.34

429327 AA283981 Hs.199248 prostaglandin E receptor 4 (subtype EP4) 9.33

407061 X97748 gb:H.sapiens PTX3 gene promotor region. 9.33

416967 BB616731 HsX0645 interferon regulatory factor 1 9.33

423013 AW875443 Hs.22209 secreted modular calcium-binding protein 9.33

439461 AA693960 Hs.103158 ESTs 9.33

418830 BE513731 HsX8959 Human DNA sequence from clone 967N21 on 9.32

422763 AA033699 HsX3938 ESTs, Moderately similar to MASP-2 [H.sa 9.32

442739 NM_007274 HsX679 cytosolic acyl coenzyme A thioester hydr 9.32

452859 AI300555 Hs.288158 Homo sapiens cDNA: FLJ23591 fis, clone L 9.32

403237 9.32

415000 AW025529 Hs.239812 ESTs, Weakly similar to CALM_HUMAN CALMO 9.31

417951 AW976410 Hs.289069 Homo sapiens cDNA: FLJ21016 fis, clone C 9.30

419066 298492 HsX975 PRO1073 protein 9.30



448443 AW167128 Hs231834 ESTs 9.30

405125 920

409768 AW499568 gbUI-HF^R0p<hh-C3^HJIj1 NIH_MGC_5 9.28

453708 AI191611 Hs54629 ESTs 9.28

442271 AF000652 Hs2160 syndecan binding protein (syntenin) 9.27

410055 AJ250839 Hs.58241 gene for serine/threonine protein kinase 9.26

448692 AW013907 Hs.224276 ESTs, Moderately similar to predicted us 9.26

417381 AF164142 Hs22042 solute carrier family 23 (nucleobase tra 9.25

422497 D29642 Hs.1528 KIAA0053 gene product 9.25

414140 AA281279 Hs23317 ESTs 9.24

435980 AF274571 Hs.129142 ESTs; Weakly similar to DEOXYRIBONUCLEAS 9.24

458530 BE395035 Hs.199889 ESTs, Weakly similar to KIAA0874 protein 9.24

402585 9.24

420819 AA280700 gb:zs95h11.s1 NCI_CGAP_GCB1 Homo sapiens 9.23

444755 AA431791 Hs.183001 ESTs 9.22

411630 U42349 Hs.71119 Putative prostate cancer tumor suppresso 9.22

421246 AW582962 Hs^00961 ESTs, Highly similar to AF151805 1 CGW 9.20

421924 BB14514 Hs.109606 coronin, actin-binding protein, 1A 9.19

414888 AL039185 Hs.77558 thyroid hormone receptor interactor 7 9.18

434267 AI206589 Hs.116243 ESTs 9.17

409213 U61412 Hs.51133 PTK6 protein tyrosine kinase 6 9.17

428242 K55709 Hs.2250 leukemia inhibitory factor (cholinergic 9.16

451736 AW080356 Hs.293684 ESTs, Weakly similar to alternatively sp 9.15

413627 BE182082 Hs246973 ESTs 9.14

416134 AA528402 Hs.74861 activated RNA polymerase II transcriptio 9.14

449251 AW151660 Hs.31444 ESTs 9.14

452813 U54727 Hs.191445 ESTs 9.14

443622 AI911527 Hs.11805 ESTs 9.14

413260 BE075281 gbf M1-BT0585-29Q200-005-d07 BT0585 Homo 9.12

413450 Z99716 Hs.75372 N-acetylgalactosaminidase, alpha- 9.12

446442 BE221533 Hs.257858 ESTs 9.12

438540 AA810021 Hs.136906 ESTs 9.12

426251 M24283 Hs.168383 Intercellular adhesion molecule 1 (CD54) 9.11

410290 AA402307 Hs.73818 ubiquinol-cytochrome c reductase hinge p 9.10

437398 AA913736 Hs.126715 ESTs 9.10

421559 NM_014720 Hs.105751 Ste20-related serine/threonine kinase 9.10

439899 AF086534 Hs.187561 ESTs, Moderately similar to ALU1_HUMAN A 9.10

430799 C19035 Hs.164259 ESTs 9.09

424544 M88700 Hs.150403 dopa decarboxylase (aromatic L-amino aci 9.08

453942 AW190920 Hs.19928 ESTs 9.08

425844 768073 Hs.159628 serine (or cysteine) proteinase inhibito 9.08

434658 AI624436 Hs.194488 ESTs 9.07

453999 BE328153 Hs.240087 ESTs 9.06

436490 R71543 Hs.18713 ESTs 9.05

409192 AA065131 Hs.233439 ESTs, Weakly similar to ALU7_HUMAN ALU S 9.05

446223 BE300091 Hs.119699 hypothetical protein FLJ12969 9.04

447247 AW369351 Hs287955 Homo sapiens cDNA FLJ13090 fis, clone NT 9.04

450094 AI174947 Hs295789 Homo sapiens mRNA; cDNA DKFZp564O1164 (f 9.04


432012 


AW301344 


Hs.195969 ESTs 


9.04 


422520 


AU 076730 


Hs.1 17977 kinesln 2 (60-70kD) 


9.02 


418650 


BE386750 


H&86978 prolyl endopepOdaie 


9.02 


423008 


M81590 


Hs.1 2301 6 5-hydroxytryptamine (serotonin) receptor 


9.02 


438476 


AA326108 


Hs£3631 ESTs 


• 9.02 


448206 


BE622585 


H&3731 ESTs 


9.02 


431574 


AW572659 


H&261373 adenosine A2b receptor pseudogsne 


9.01 


443453 


R&878 


Hs269882 ESTs 


9.01 


435472 


AW972330 


H&283Q22 triggering receptor expressed on myeloid 


9.01 


420337 


AW295840 


Hs.14555 Homo sapiens cONA: FU2151 3 fis, clone C 


9.00 


449810 


AB008681 


Hs23994 ecfivin A receptor, type IIB 


9.00 


406760 


AA902388 


H&288 ribosornaJ protein L4 


8.99 


429169 


AW341130 


Hs.197757 ESTs, Moderately similar to FGFEJiUMAN F 


8.99 


421326 


AFQ51428 


Hs.1 03504 estrogen receptor 2 (ER beta) 


827 


425491 


AA883316 


Hs255221 ESTs 


8.96 


425516 


BEQ007D7 


H&29567 ESTs 


8.98 


439773 


A1051313 


Hs.143315 ESTs 


8.98 


443247 


BE614387 


HS47378 ESTs 


826 


456523 


AI084125 


Hs.108106 transcription factor 


8.85 


438707 


L08239 


H&5326 porcupine 


825 


4Q2240 




825 
444152 AI125694 Hs.149305 Homo sapiens cDNA FLJ14264 fis, clone PL 8.95 

409842 AW501756 p^AJl-HMR0p^TK>OWWLr1 NIH.MGC.5 854 

416277 W78765 H&73580 ESTs 8.94 

456697 AI908006 Hs.111334 terrain, Qght polypeptide 854 

5 410762 AF226053 H&66170 HSKM-fl protein 8.92 

412942 AL120344 Hs.75074 mitogervactivated protein kinase-activat 8.92 

442320 AI267817 H&129636 ESTs 8.92 

449673 AA0Q2064 Hs.18920 ESTs 851 

411486 N85785 Hs.181165 eukaryotic translation elongation factor 850 

10 437916 BB66249 H&20999 Homo sapiens cDNA: RJ231 42 fis, clone L 850 

442732 AA257161 Hs£658 hypothetical protein DKFZp434E0321 8JB9 

419741 NMJH7019 Hs530Q2 ublquftin carrier protein E2-C 859 

411499 AW849292 gWL3^T021 5^)20300^)90- E06 CT0215 Homo 859. 

431154 AW971228 Ks290259 ESTs 859 

15 414922 D00723 Hs.77631 glycine deavage system protein H (amino 858 

418036 Z37976 Ks53337 latent transforming growth factor beta b 857 

406422 857 

422926 NNL016102 Hs.121748 ring finger protein 16 857 

435220 D50030 Hs.104 HQF activator 856 

20 418203 X54942 Hs53758 COC28 protein kinase 2 856 

418613 AA744529 Hs56575 mitogen-activated protein kinase kinase 855 

439250 H66566 H&271711 ESTs 855 

432359 AA076049 H&274415 Homo sapiens cONA FU1Q229 Rs, done HE 854 

450000 AI952797 Hs. 10888 Homo sapiens cDNA:FU21559 fis, clone C 853 

25 425657 T89839 Hs.119471 ESTs 853 

425694 U51333 Hs.159237 hexokinase 3 (white cell) 852 

419972 AL041465 H&294038 ESTs, Moderately similar to ALU2_HUMAN A 852 

436396 A1683487 H&299112 Homo sapiens cDNA FU1 1441 fis, done HE 852 

413413 D82520 Hs501834 Homo sapiens cDNA FU10952 fls, done PL 852 

30 428807 AA435997 Ha.104930 ESTs 852 

415839 R40611 Hs.137565 ESTs 851 

419553 N34145 H&250614 ESTs 850 

420309 AW043637 H&21766 ESTs 850 

421683 A1952877 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (fr 850 

35 447965 AW292577 Hs.94445 ESTs 850 

459172 BB063380 gb:PM0^T0275-291099OQ2-g10 BT0275 Homo 850 

403259 8.78 

411534 AW850473 gb:IL3^T0219-280100O61-B11 CT0219Homo 8.78 

456161 BE264645 H&282093 Homo sapiens cDNA: FU21918 fis, done H 8.77 

40 413654 AA331881 Hs.75454 peroodredoxln 3 8.76 

401744 8.78 

425348 AL137477 Hs.155912 cadherin-Qte 24 8.76 

4233% AI382555 Hs.127850 bronwdorramH»rrtafeiing 1 8.75 

450649 NM.001429 H&297722 Human DNA sequence from done RP1-85F18 8.75 

45 408331 NM.007240 Hs.44229 dual specificity phosphatase 12 8.74 

423872 AB020316 Hs.134015 urony! 2-sutfotransferase 8.74 

424906 AI566086 Hs.153716 Homo sapiens mRNA tor Hmob33 protein. 3 1 8.74 

427596 AA449506 Hs.179765 Homo sapiens mRNA; cONA DKFZp585H1921 (f 8.73 

432488 AA551010 H&216640 ESTs 8.72 

50 448980 AL137527 H&22703 Homo sapiens mRNA; CDNA DKFZp434P1018 fj 8.72 

429455 A1472111 H&292507 ESTs 8.71 

429855 AW385597 Hs. 138902 ESTs, WeaWy similar to B34087 hypotheti 8.71 

441746 H59955 Hs.127829 ESTs " 8.70 

411945 AL033527 Hs.92137 v-myc avian myelocytomatosis viral oncog 8.70 

55 413492 D87470 Hs.75400 KIAA0280 protein 8.70 

435706 W31254 Hs.7045 GU)04 protein 8.70 

433741 AA609019 Hs.159343 ESTs 8.70 

426340 Z97989 Hs. 169370 FYN oncogene related to SRC, FOR, YES 859 

422779 AA317038 Hs.41989 ESTs , 857 

60 449785 AJ225235 Hs288300 Homo sapiens cDNA: FLJ23231 fis, done C 8.67 

420144 AA811813 Hs.1 19421 ESTs 8.66 

420235 AA256756 Hs51178 ESTs 856 

432606 NMJW2104 Hs5066 granzyme K (serine protease, granzyme 3; 856 

425762 BE244076 Hs.159578 Homo sapiens mRNA for FU00020 protein, 855 

427448 BE246449 Hs.2157 Wiskott-Aldrich syndrome (eczema-thrombo 854 

418033 W68180 H&259855 Homo sapiens cDNA FU12507 fis, done NT 854 

429084 AJ001443 Hs.195614 spndngfactor3b l aibuntt3 t 130kD 8.64 

417094 NMJJ06895 Hs51182 histamine r^ftyftransferase 8.64 

457277 NMJ004736 H&227656 xenoiropfc and potytroptc retrovirus roc 853 
422631 BE218919 Hs.118793 hypothetical protein FLJ10688 6.63 

410670 AW795196 Ha215857 ring finger protein 14 6.63 

431585 BE242803 Hs262823 hypothetical protein FU1G326 6.62 

401851 852 

5 401866 8.62 

407763 AW996872 Hs.1 72028 a disfntegrin and metaUoprotBinase doma 8.62 

408242 AA251594 Hs43913 PIBF1 gene product 8.62 

422250 AW408530 Hs.1 13823 CtpX (caseinofytic protease X, E coli) 8.62 

430259 BE550182 Hs.127826 RalGEF-Eke protein 3, mouse homotog 8.62 

10 452598 AI831594 Hs-68647 ESTs, WeaWy similar to ALU7_HUMAN ALU S *62 

419541 AW749617 gbflC3^T0502-1301Q(M)12-Q07 BT0502 Homo 8.60 

428839 AI767756 H&82302 ESTs 8.60 

429328 AA828402 Hs.47939 ESTs 8.60 

451491 AI972094 H&286221 Homo sapiens cONA HJ1 3741 fis, done PL 8.60 

15 452561 AI692181 Hs49169 KIM1 634 protein 8.60 

420027 AP009746 Hs.94395 ATP-binding cassette, sub-family D (ALD) 8.60 

435205 X54136 Hs.1 81 125 immunogSobutln lambda focus 8.60 

430900 U91939 Hs£48123 Q protein-coupled receptor 25 8.60 

405074 8.59 

20 437991 A1479773 Hs.181679 ESTs 859 

436346 BE328882 Hs.193096 ESTs, Moderately similar to U1 19_HUMAN U 858 

411079 AA091228 gbxchn2152^eqP Human fetal heart, Lam 857 

418452 BE379749 H&85201 C-type (calcium dependent carbohydrate- 856 

429109 AL008637 Hs.1 96352 neutrophD cytosolic factor 4 (40kD) 856 

25 448019 AW947164 Hs.195641 ESTs 856 

449865 AW204272 Hs.199371 ESTs 8.55 

431180 H55883 gb:yq94h03.r1 Scares fetal liver spleen 854 

445988 BE007663 H&13503 Inacfivafion escape 2 854 

405876 854 

30 407235 D20569 Hs.169407 SAC2 (suppressor of actin mutations 2, y 854 

414607 AI738616 Hs.77348 rrydroxyprostaglandln dehydrogenase 15-(N 854 

425671 AF1 93612 Hs.159142 lunatic fringe (Drosophiia) homolog 854 

452413 AW082633 Hs£12715 ESTs 854 

421620 AA446183 Hs.91885 ESTs 853 

35 444539 AI955765 Hs.146907 ESTs 852 

415102 M31899 Hs.77929 excision repair cross-cornplementing rode 851 

405552 851 

416068 AW971155 H&293902 ESTs, WeaWy similar to prolyl 4-hydroxy 850 

420133 AA426117 Hs.14373 ESTs 850 

40 438867 R68857 H5L265499 ESTs 850 

446468 AI765690 Hs.16341 ESTs; Moderately similar to Ml ALU SUB 850 

446585 AV659397 Hs£82948 ESTs 850 

441696 AW891873 flb.CM3-WT009O<)40500-17W)Q2 NT009O Homo 850 

437718 AI927288 Hs.196779 ESTs 8^8 

45 420656 AA279098 Hs.1 67636 ESTs 8.48 

429303 AW137635 Hs.44238 ESTs 8.48 

450624 ALA43983 Hs.1 25063 Homo sapiens cONA RJ1 3825 fis, clone TH 8.48 

452573 AI907957 H&267622 Homo sapiens cDNA FU14082 fis, clone HE 8.48 

456341 AA229126 Hs.122647 N^istoyftransferase 2 8.48 

50 423024 AA593731 Hs.75613 C036 antigen (collagen type I receptor, 8.47 

446985 AL038704 Hs.156827 ESTs, WeaMysimflar to ALU1JHUMAN ALUS 8.46 

431778 AL080276 H&268562 regulator of Gisrotetn signalling 17 8.46 
400268 * 8.46 

421828 AWB91865 H&289109 dimetrr/targirtfne dimefrylamlnohydrotase 8.45 

55 417022 NM.014737 Hs.80905 Ras assocfefon (RaIGDS/AF-6) domain fam 8.44 

421029 AW057782 Hs^93053 ESTs 8.44 

425171 AW732240 Hs500615 ESTs 8.44 

459070 AI814302 gbw|71c12JCl NCLCGAP_Lu19 Homo sapiens 8.42 

406006 8.42 

60 412643 AW971239 Hs.293982 ESTs 8.42 

424775 AB014540 Hs.1 53026 SWAP-70 protein 8.42 

446648 AW136083 Hs.195266 ESTs, WeaWy simiiar to S59501 interfero B.42 

448043 A1458653 Hs£01881 ESTs 8.41 

407183 AA358015 gb£ST66864 Fetal tung III Homo sapiens 8.40 

65 412324 AW978439 Hsj69504 ESTs 8.40 

419594 AA013051 Hs*1417 topoisomerase (DNA) II binding protein 8.40 

430968 AW972830 gb:EST384925 MAGE resequences, MAGL Homo 8.40 

431689 AA305688 H&267695 UDP-GaJfcetaGlcNAc beta 1,3-galactosyttr &40 

438582 AI521310 H&2833K ESTs, WeaWy similar to ALU5_HUMAN ALU S &40 
447685 AL122043 Hs.18221 hypothetical protein DKFZp566G1424 8.40 

459119 AWB444S8 Hs389052 Homo sapiens LBIG8 mRNA, variant C, part 838 

400817 837 

425265 BE245297 gb:TCBAP1E2482 Pediatric pre-BceHacut 837 

5 409385 AA071267 gb:zm61g01.r1 Stratagene fibroblast (937 838 

439121 BB047779 HiL44701 ESTa 838 

419968 X04430 Hs.93913 Interleukin 6 (interferon, beta 2) 838 

408327 AW182309 H&249963 ESTs, Highly stmDar to dJ1170K44[H^a 8.35 

403978 8.34 

10 448064 AA379038 gb£ST91 809 Synovial sarcoma Homo saplen 833 

442914 AW188551 H&99519 Homo sapiens cONA RJ14007 fis, done Y7 833 

428032 AW997704 Hs, 11493 Homo sapiens cONA RJ 13536 fis, done PL 832 

434194 AF1 19847 H&283940 Homo sapiens PR0 1550 mRNA, partial cds 832 

458677 AW937670 Hs254379 ESTs 832 

15 420925 NM.015698 H&100391 754 protein 830 

416475 T7Q298 gbryd26g02^1 Soares fatal Ever spleen 830 

416852 AF283776 Hs-80285 Homo sapiens mRNA; cONA DW=Zp588C1723 (f 830 

430676 AF084866 gb:Homo sapiens envelope protein R1C-3 ( 830 

428455 A1732694 HS38520 ESTs 829 

20 435343 AW194962 Hs. 199028 ESTs 839 

450783 BE266695 gb:601 19Q242F1 N1H_MGC_7 Homo sapiens CD 839 

404946 828 

422942 AF054839 H&122540 tetraspan2 838 

453716 AA037675 Hs.152675 ESTs 838 

25 437098 AA744488 Hs.132842 ESTs, Moderately similar to ALULHUMAN A 838 

443907 AU076484 Hs3963 TYRO protein tyrosine kinase binding pro 837 

401930 AF106069 Hs33168 ubiquitin specific protease 15 836 

446554 AA151730 ' Hs.301789 ESTs, WeaHy similar to slmDar to C.ete 836 

426290 AB007918 Hs.169182 KIAA0449 protein 835 

30 419904 AA974411 Hs.18672 ESTs 835 

413886 AW958264 H&103832 ESTs, WeaWy similar to TRHY.HUMAN TRICH 834 

424738 AI963740 Hs46828 ESTs 834 

427359 AWO20782 Ha.79881 Homo sapiens cONA: RJ23006 fis, done L 834 

424534 D87682 Hs.150275 WAA0241 protein 834 

35 424429 U63830 Hs.146847 TRAF family member-associated NFKB acfiv 834 

442604 BE263710 Hs379904 ESTs 832 

442992 AI914699 Hs.13297 ESTs 832 

427210 BE396283 Hs.173987 eukaryotic translation Initiation factor 832 

457229 BE222450 Hs368390 ESTs 831 

40 423730 AA330214 gbfST33935 Embryo, 12 WBek II Homo sap] 831 

411928 AA888624 Hs.19121 adaptor-related protein complex 2, alpha 830 

416051 AAB35868 Hs.25253 Homo sapiens cONA: FU20935 fis, done A 820 

417231 R40739 H&31326 ESTs 830 

422049 W25760 Hs.77631 grydne cleavage system protein H (amino 830 

45 427528 AU077143 Hs.179565 mlnWuomosome maintenance deficient (S. 830 

458776 AV654978 Hs.19904 cystathlonasa (cystathionine gamma-lyase 8.19 

417687 A1828596 HS350691 ESTs 8.18 

423218 NIvL0t5896 Hs.167380 BLu protein 8.18 

425397 J04088 H&156346 topoisomerase (DNA) II alpha (170kD) 8.18 

50 406964 M21305 Hs347946 Human alpha satellite and satellite 3 )u 8.18 

402401 U42349 Hs.71119 Putative prostate cancer tumor suppresso 8.18 

423397 NM 001838 Hs.1652 chemokine (C-C motif) receptor 7 8.18 

427857 AL133017 Hs3210 thyroid hormone reoeptor interactor 3 - 8.17 

401519 8.17 

55 447188 H65423 Hs.17631 Homo sapiens cONA FU201 18 fis, done CO 8.16 

424704 A1263293 Hs.152096 cytochrome P450, subfamily IU (arachido 8.16 

435854 AJ278120 H&4998 DKFZP564D1 68 protein 8.14 

448556 AW885606 H&5064 ESTs 8.14 

449217 AA278538 H 523262 riboraidease, RNase A tamlly, k8 8.14 

60 453124 AI139058 HS33298 ESTs 8.14 

442812 AI018406 Hs.131284 ESTs 8.14 

421129 BE439899 Hs39271 ESTs 8.14 
TABLE 9A shows the accession numbers for those primekeys lacking a Unigene ID in Table 
9. For each probeset we have listed the gene cluster number from which the oligonucleotides 
were designed. Gene clusters were compiled using sequences derived from Genbank ESTs 
and mRNAs. These sequences were clustered based on sequence similarity using Clustering 
and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers 
for sequences comprising each cluster are listed in the "Accession" column. 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 
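
As a rough illustration of how the three columns of Table 9A relate to one another, the following Python sketch parses one cleaned-up row into a Pkey, a CAT number and its list of Genbank accessions, and then builds a reverse lookup from accession to Pkey. The function and class names are hypothetical, and the underscore form of the CAT number (`123128_1`) is an assumption about how the scanned value "123128 1" originally read; the row contents themselves are taken from the table.

```python
from typing import Dict, List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of Table 9A (names are illustrative, not from the source)."""
    pkey: int               # unique Eos probeset identifier
    cat_number: str         # gene cluster number, kept as text
    accessions: List[str]   # Genbank accessions making up the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number, accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, List[int]]:
    """Map every accession to the Pkeys whose cluster contains it."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


# Two rows from Table 9A; the "_1" CAT suffix is an assumed reading of the scan.
rows = [
    parse_cluster_row("411079 123128_1 AA091228 H71860 H71073"),
    parse_cluster_row("413471 1371778_1 BE142098 BE142092"),
]
accession_index = index_by_accession(rows)
```

Keeping the CAT number as a string avoids committing to a numeric interpretation of a value whose punctuation is uncertain in the scanned text.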



Pkey CAT number Accession 






409126 
409292 
409314 
25 409385 
409398 



409671 
409768 
409841 
409842 
409853 
410531 
410688 
410846 



410896 1226053J 



408057 1035720.-1 AW139565 

408069 103655J H81785Z42291 R20973 AA048920 

408182 10447BJ AAD47854 AA057506 AA053841 

1052148J AW867079 AW867086 AW182772 

108463 1 BE540279 AW410659 AA057857 R77693 BE278674 

110159 1 AAD63426AW962323AW408063AA063503AA772927AW^3492BE1 75371 AA311147 

111586 1 AA071051 AA070584 AA069938 AA102136 AA074430 

111841 1 AA070266 AA0849S7 AA126998 

112523 1 AA071267 T65940 T64515 AA071334 

1126716 1 AW386461 AW876408 AW386672 AW386599 AW876258 AW386619 AW386289 AW876136 AW876203 AW876213 AWB76301 
AW876295 AW876349 AW876365 AW876160 AW876369 AWB76352 AW876271 

114731J AA076769 AA076781 A1087968 m 

1154035 1 AW499566 AW502378 AW499522 AW502046 AW502671 AW501917 AW501868AW501721 AW502813 

1 156038 J AW502139 AW502432 AW502235 AW501 683 AW502647 

1156119 1 AW501756 AW502096 AW502465 AW501715 

1156226~1 AW502327AW502488 AW501829 AW502625 AW5Q2687 

1207200J AW752953 H88044 BE156092 

1216101 1 AW796342 AW78S356 BE161430 

" AW807057 AW807054 AW807189 AW807193 AW807369 AW807429 AW807364 AWB073S5 AW807078 AW807256 AWB07180 
AW807331 

AW809637 AW809697 AWB1 0554 AW809707 AW809885 AW81 0000 AWB1 0088 AW809742 AW809816 AWB09749 AW809639 
AW809722 AWBQ9836 AW809774 AW81 0023 AW810013 AWB09813 AW809660 AW809728 AW809768 AW809951 AWB09657 
AW809954 

411079 123128 1 AA091228 H71860 H71073 

411424 1245497 1 AWB45985 AW845991 AW845962 

411499 1248105 1 AW849292 AW849431 AW849422 AW849428 AW849420 AW849424 AW849427 

411507 1248607J AW850140 AWB50195 AW850192 

411534 1248827J AW850473 AWB50471 AW850431 AW850523 

411972 1268491J BB074959 AW880160 

412110 1277844.1 AW893569 AW893571 AWB93588 AW893593 

412226 1284289.1 W26786 AW998612 AW902272 

412257 1285376 1 AW9O3830 BE071916 

412405 1293012"l AW948126 AW948139 AW948196 AW948145 AW948162 AW948134 AW948127 AW948124.AW948153 AW948157 AW948125 

AW948131 AW948158 AW948164 AW948151 

413260 1356003 1 BE075281 BE075219 BE075123 BB075119 BE075046 

413471 1371778J BE142098 BE142092 
413729 1385114 1 BE159999 BE160056 BE160107 BE160139 

414182 142409.1 AA1 36301 AI381776AA1 36321 

414989 1511339J T81 668 C19040C1 7569 
415354 1534763.1 PQ6495R24336R13046 
416011 1566439.1 H14487 R50911 Z43216 
416475 1596398.1 T7Q288 H58072 R02750 

417380 1672461.1 T06809 N75735 
419392 1843934.-1 W28573 

419541 185724.1 AW749617 R64714 AA244138 AA244137 BE094019 
419544 185760.2 AI909154 AA526337 AA244193AI909153 
420819 196721.1 AA280700 AW975494 AA687385 
421245 200620 1 AA285363 AA285333 AA285&9 AA285326 AA285350 
422673 219674~1 N58027 AA314694 N53937 R08100 
422858 
422940 
423730 
423790 
424385 
424606 
425265 



430676 



431180 
432093 
434596 



437159 
4374% 
439097 
439120 
440134 
441896 
445629 
447229 
448064 
4507B3 
451045 
452549 
452560 



219998.1 AA315158 AW881298 N76067 AW802759 AI858495 W04474 

222209 J R35398 BE252178 AA318153 

223106.1 BE077458 AA337277 AA319285 

231463.1 AA330214 AW982519 T54709 

232031J BE152393 AA330984 BB073904 

238731.1 AA339666AW952809AA349119 

241409.1 AA343936 AA344060 AW963081 

249175.1 BE245297 AA353976 AW505Q23 

273830.-1 BE262745 

32168.1 AF084866 ARS4870 AF084854 AF084867 AF084869 AF084865 AF084868 AW81 8206 AW812038 BE144813BE144812 
AW812041 AW812040 AW8 12067 BE061583 BE061604 T05808 AI352469 AA5B0921 BE141783 BE141782 BB061601 
AW814393AWB85029 
AW972830 AA527B47 AA489820 AA570362 
H55883 AW971249 AA493900 H55788 
H28383 AW972S70 H28359 AA525808 
T59538 T59589 759598 T59542 AF147374 
AJ132085 283805 
AL050072AW900148 
BE177778 BE177779 AL390180 AA359908 
H66948AF085954 K66949 
H56389AF085977K56173 

BE410734 BE5601 17 BE270054 BE296330 BE267957 AI003007 BE545259 
AW891873 AW891897 BE564764 
AI245701 BE272724 
BBB17135 AW504051 AW504283 
AA379036 AA150589 AJ696854 BE621316 
BE266695 BE265474 N53200 BE267333 
AA215672 A1696628 AA013335 H88334 AAO17O06 
AI907039AKXJ7081 

BE077084 A W1 39963 AW863127 AW806209 AW806204 AW806205 AW806206 AW80621 1 AW806212 AW806207 AW806208 
AW806210AI907497 

AW838616 AW838660 BE144343 AI914520 AW88891 0 BE184854 BE184784 
U83527 AL120938 U83522 

AW860158 AW862385 AW860159 AWB62386 AW862341 AW821869 AW821893 AW062660 AW062656 
AW807530 AW807540 AW807537 AW846086 BE141634 AW846089 AW807499 AW807533 AWB38499 
BB071874 BE071882 AW820782 AW821007 

AW848032 AW848830 AWB48478 AW848623 AW848484 AW848169 AW848830 AW848149 AW8481 19 AW848893 AW848903 
AWB48407 

AW857913 AW857B16 AW857914 AW861G27 AW861626 AW861624 
AW984111 AW863918 AWB83856 

AW877015 AWB77133 AW876978 AW877071 AWB76988 AW877069 AW877063 AW877013 

AW903533 AW903516 AW903562 BE085202 BE085215 BE085214 BED85209 BE085172 BE085175 BE085193 BE085211 
BE085199 

BE176862 BE176876 BE176947 BE176878 

BE243628 BE246081 BE247016 BE241984 BE241534 BE246091 BE245679 BE243520 BE245998 BE242329 BE241417 
BE241457 BE242522 BE241989 BE241464 
1416335.1 R00028BE247630 
360505.1 AW062439 AW751554 AA579463 
364225.-1 AA584854 
399422.1 AI908238AA663731 
883688.1 AI814302AI814428 
889426.1 W07808A1822066 
918957.1 AI903354A1903489AI903488 
921149.1 BE063380 BE063346 AI806097 
945240.-1 AJ940425 



J 

341283.1 

38937.1 

41842L1 

43393.1 

43765.1 

46858J 

46879.1 

48575.1 

52842.1 

645767J 

71288.1 

74761.1 

84655.1 

85673.1 

921802.1 

922216.1 



452712 928309.1 

453758 980026.1 

454093 1007366.1 

454563 1224342.1 

454791 1234759.1 

454977 1247099.1 

455131 1254674J 

455183 1259023.1 

455254 1266449.1 

455369 1285173.16 

455982 1395849.1 

456011 1410860.1 



456023 
457586 
457595 
457751 
459070 
459081 
459145 
459172 
459234 
TABLE 9B shows the genomic positioning for those primekeys lacking Unigene IDs and 
accession numbers in Table 9. For each predicted exon, we have listed the genomic sequence 
source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique number corresponding to an Eos probeset 
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted 
Nt_position: Indicates nucleotide positions of predicted exons. 
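
The nucleotide-position column packs one or more predicted exons into a single comma-separated string of `start-end` pairs. A minimal Python sketch of how such a string might be unpacked is given below; the function names are hypothetical, and treating the coordinates as 1-based and inclusive is an assumption, since the table does not state its coordinate convention.

```python
from typing import List, Tuple


def parse_nt_position(text: str) -> List[Tuple[int, int]]:
    """Turn 'start-end,start-end,...' into a list of (start, end) pairs."""
    exons: List[Tuple[int, int]] = []
    for part in text.split(","):
        start_text, separator, end_text = part.strip().partition("-")
        if not separator:
            raise ValueError(f"missing '-' in exon range: {part!r}")
        exons.append((int(start_text), int(end_text)))
    return exons


def total_exon_length(exons: List[Tuple[int, int]]) -> int:
    """Sum exon lengths, assuming 1-based inclusive coordinates."""
    return sum(abs(end - start) + 1 for start, end in exons)


# Position string listed for Pkey 400557 (Ref 9801261, Plus strand).
exons_400557 = parse_nt_position("208453-208528,209633-209813")
length_400557 = total_exon_length(exons_400557)
```

Entries whose hyphens or commas were lost in scanning would raise `ValueError` here, which is one way to flag rows that need to be checked against the original publication.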



Pkey Ref Strand Nt_position
400452 8113550 Minus 80308-90505
400557 9801261 Plus 208453-208528,209633-209813
400615 9908994 Plus 118036-118166,118681-118807
400802 8567867 Minus 174571-174856
400817 8569994 Plus 170793-170948
400880 9931121 Plus 29235-28336,36383-36580
400885 9958187 Minus 58242-58733
400926 7651921 Minus 52033-52158^3956^4120 f 54957-65052^5420-55480^6452-56666 t 57221 -5771 6
400952 7658481 Plus 1 92667-1 93J26.1 94387-1 94876
400991 8096825 Plus 159197-159320
401044 8117619 Plus 73501-73674
401124 8570296 Minus 124181-124391
401163 6981820 Plus 5302-5545
401201 9743387 Minus 138534-138629,139234-139294,140121-140335,142033-142479
401286 9801342 Minus 147036-147318
401384 6850939 Minus 58360-58545
401468 6433826 Plus 13056-13482
401515 7630851 Plus 29929-30126
401519 6649315 Plus 157315-157950
401672 9838136 Plus 128526-128704,130755-130860
401744 2576349 Plus 14595-14751
401851 7770425 Minus 146443-146664,147794-147971,148351-148480,148980-149111,149801-149949
401866 8018108 Plus 73126-73623
402240 7690131 Plus 104382-104527,106136-106372
402359 9211204 Minus 4040341961
402585 9908890 Minus 174893-175050,183210-183435
402788 9796102 Plus 98273-101430
402802 3287156 Minus 53242-53432
402812 6010110 Plus 25026-25091,25844-25920
402628 8918414 Plus 69071-69642
402635 9187337 Plus 26961-27101
402638 9369121 Minus 32589-32735,35478-35666
402842 9369121 Minus 76355-76479
402895 9967547 Plus 85537-85671,86379-86469
402964 9581599 Minus 46624-46784
403137 9211494 Minus 92349^2572^2958-93084^3579-93712^3949^4072^4591 -94748,95214-95337
403237 7637807 Plus 7271-7527
403259 7770585 Plus 46934857
403683 7331517 Plus 217175-217446
403690 7387384 Minus 78627-79583
403708 5705981 Minus 134394-134812
403838 4176355 Plus 19197-19502
403851 7708872 Plus 22733-23007
403976 7657840 Plus 24755-24969
404407 7329316 Minus 4815448499
404428 7407959 Plus 77842-77954
404632 9796668 Plus 4509645229
404741 B574139 Plus 143025-143467
404756 7706327 Plus 82849-83627
404946 7382189 Plus 134445-134750
405074 7770440 Plus 4434044559,4479045059
405125 8247873 Plus 137113-137814
405172 9986752 Plus 153027-153262
405238 7249076 Minus 151699-151915
405325 6094661 Minus 25818-26380
405411 3451356 Minus 17503-17778,18021-18290
405495 8050952 Minus 72182-72373
405552 1552506 Plus 4519945647
405601 5815493 Minus 147835-147935,149220-149299
405685 4508129 Minus 37956-38097
405777 7263187 Minus 104773-105051
405856 7653009 Plus 101777-102043
405876 6758747 Plus 3969440031
405932 7767B12 Minus 123525-123713
405834 6758795 Plus 159913-160605
406006 8247801 Minus 4264042776
406134 9163473 Plus 153291-153452
406169 7289992 Minus 22007-22234
406422 9256411 Plus 163003-163311
406516 7711422 Minus 128375-128449,128560-128784
406538 7711478 Plus 35196^5367^8229^8476^008040216,4352243840
406554 7711566 Plus 106956-107121
406577 7711730 Plus 11377-11509
TABLE 10: shows genes, including expressed sequence tags, differentially expressed in 
taxol resistant prostate tumor xenografts as compared to taxol sensitive prostate tumor 
xenografts. The genes are indicated as either being upregulated or downregulated during the 
induction of taxol resistance in sequential passages of the grafts. 



Pkey ExAccn UnigeneID UnigeneTitle Eos RespKO F00 F02 F02 F05 F05 F07 F09 F10 F11 F13 F14
117921 N510Q2 Hs.47170 UprinA2 PM28 UP 1 9 8 9 32 20 34 122 105 82 71 111
112071 T17185 Hs.4299 ESTs CHA1 down 290 281 267 335 270 284 150 157 83 89 49 75
126645 A1167942 Hs.61635 STEAP PAAS down 106 111 103 71 34 67 33 14 2 1 1 1
119018 N95796 Hs.179809 ESTs PAB2 down 765 841 757 909 742 704 47B 428 253 175 228 238
110844 N31952 Hs.167531 ESTs PAV7 down 175 192 147 141 123 129 73 65 55 48 54 84
100654 HG2841-H72969 Hs.75442 Albumin, A PM01 down 666 605 504 728 357 445 602 187 117 127 117 113
100655 HG2641-HT2970 Hs.75442 Albumin, A PM02 down 620 653 486 688 368 386 606 175 101 95 115 97
102078 U09579 Hs252437 cyclin-dep PM03 down 101 94 143 190 105 107 88 40 34 31 46 22
102208 U22981 Hs.75442 albumin PM04 down 495 424 323 518 252 288 467 188 169 143 165 145
103739 AA075779 mitochondr PM05 down 75 190 606 230 378 106 218 88 69 192 69 99
107036 AA599690 Ha.15725 SBBI48 PM06 down 87 124 115 168 132 111 66 71 49 70 38 50
108242 AA062746 ESTs PM07 down 14 20 252 13 22 43 193 10 10 104 21 18
108282 AA055143 solute car PM08 down 27 54 178 73 108 37 53 24 14 53 15 34
108679 AA115963 beta-1-gto PM09 down 680 693 1292 656 869 389 1 74 118 662 359 409
108731 AA126313 Hs.107476 ATP syntha PM10 down 10 19 165 25 60 1 32 3 7 14 1 1
110675 H89355 Hs.6598 adrenergic PM11 down 207 334 237 239 231 220 119 145 93 64 56 124
115412 AA283804 Hs.193552 ESTs PM12 down 146 316 282 271 340 334 115 238 100 196 83 207
115844 AA430124 Hs.234607 MDM2 PM13 down 49 93 94 154 132 91 23 54 23 78 14 41
120588 AA281591 Hs.16193 ESTs PM14 down 80 157 58 141 159 127 39 63 35 37 16 46
132349 Y00705 Hs.181286 serine pro PM15 down 146 217 214 150 106 128 177 85 54 63 66 56
132888 AA490775 Hs.5920 N-acetytma PM16 down 92 150 132 178 126 139 53 94 48 67 41 60
132967 AA032221 Hs.61635 STEAP PM17 down 224 208 203 215 205 180 132 65 68 50 48 63
133063 AA283085 Hs.64065 ESTs PM18 down 85 148 161 150 92 108 42 99 42 65 29 126
134374 D62633 Hs.8236 ESTs PM19 down 230 240 194 212 231 189 89 123 107 95 68 91
135400 M23263 Hsj99915 androgen r PM20 down 36 187 99 178 132 101 23 71 26 122 14 44



Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
Eos: Internal Eos name 
F00-F14: passage number 
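
Table 10 labels each probeset as up- or down-regulated across sequential xenograft passages, but the criterion used to assign that label is not stated here. The Python sketch below shows one simple way such a call could be approximated: compare the mean signal of the earliest passages with that of the latest passages. The function name, the two-fold threshold and the signal floor are all assumptions made for illustration, not the method used to build the table.

```python
from statistics import mean
from typing import Sequence


def classify_trend(values: Sequence[float], fold: float = 2.0,
                   floor: float = 1.0) -> str:
    """Label a passage series 'up', 'down' or 'unchanged'.

    The first and last halves of the series are averaged and compared.
    `fold` (assumed 2x) is the minimum change required for a call, and
    `floor` keeps very low signals from producing extreme ratios.
    """
    if len(values) < 2:
        raise ValueError("need at least two passages to compare")
    half = len(values) // 2
    early = max(mean(values[:half]), floor)
    late = max(mean(values[-half:]), floor)
    if late >= fold * early:
        return "up"
    if early >= fold * late:
        return "down"
    return "unchanged"


# Passage series as listed for Pkey 126645 (STEAP) and Pkey 117921.
steap_call = classify_trend([106, 111, 103, 71, 34, 67, 33, 14, 2, 1, 1, 1])
pkey_117921_call = classify_trend([1, 9, 8, 9, 32, 20, 34, 122, 105, 82, 71, 111])
```

With these settings the two example series come out as "down" and "up" respectively, which agrees with the labels printed in the table for those rows; other rows may need a different threshold or a trend test that uses every passage.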
TABLE 11: shows genes, including expressed sequence tags, that are up-regulated in 
prostate tumor tissue compared to normal prostate tissue as analyzed using the Affymetrix/Eos 
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average" 
prostate cancer tissues. 



Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: Background subtracted normal prostate : prostate tumor tissue 
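
Because R1 is a normal-to-tumor ratio, a gene that is up-regulated in tumor has a value below 1, and the smaller the value the stronger the up-regulation. The short Python sketch below illustrates that arithmetic. The function names, the signal floor and the example intensities are hypothetical; only the direction of the ratio (background-subtracted normal over tumor) is taken from the legend.

```python
from statistics import mean
from typing import Sequence


def normal_to_tumor_ratio(normal: Sequence[float], tumor: Sequence[float],
                          background: float = 0.0, floor: float = 1.0) -> float:
    """Average each tissue group, subtract background, return normal/tumor.

    `floor` (an assumed safeguard) stops a background-subtracted average
    at or below zero from producing a negative or undefined ratio.
    """
    normal_signal = max(mean(normal) - background, floor)
    tumor_signal = max(mean(tumor) - background, floor)
    return normal_signal / tumor_signal


def is_up_in_tumor(ratio: float, cutoff: float = 1.0) -> bool:
    """A ratio below the cutoff means higher expression in tumor."""
    return ratio < cutoff


# Hypothetical intensities: low in normal prostate, high in tumor samples.
example_ratio = normal_to_tumor_ratio([10.0, 14.0], [600.0, 600.0])
example_call = is_up_in_tumor(example_ratio)
```

A ratio of 0.02 in this convention corresponds to roughly fifty-fold higher signal in tumor than in normal tissue.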





Pkey ExAccn UnigeneID
101336 L49169 Hs.75678
130842 M63438 Hs.156110
133512 X01677 Hs.195188
133436 H44631 Hs.737
129292 X13810 Hs.1101
100610 HG2566-HT4782
133448 M34516 Hs.170116
125193 W67577 Hs.84298
133456 T49257 Hs.183704
134546 AA459310 Hs.8518
102131 U15085 Hs.1182
101375 M13560 Hs.84298
100674 HG3033-HT3194
134365 R32377 Hs.82240
132335 D60387 Hs.189885
110303 H37901 Hs.32706
131678 N59162 Hs.30542
116599 D80046 Hs.250879
133769 M17733 Hs.75968
107004 AA026548 Hs.61389
129427 T80746 Hs.111334
105987 AA406631 Hs.110299
131466 F03233 Hs27189
102859 X00274 Hs.76807
134626 S82198 Hs.8709
134170 M63133 Hs.79572
131713 X57809 Hs.181125
100748 HG3517-HT3711
118769 N74496
111734 R25375 Hs.126916
109221 AA192755 Hs.85840
133846 AA480073 Hs.76719
135281 AA401575 Hs.97757
119073 R32894 Hs.45514
100760 HG3576-HT377B
101426 M19483 Hs25
129568 AA428025 Hs.114360
130900 Z38468 Hs21036
133879 M13829 Hs.77183
100627 HG2702-HT2798
129424 M55593 Hs.111301
128652 AA621245 Hs.103147
129979 T72635 Hs.13956
133468 X03068 Hs.73931
102636 U67092
129536 M33493 Hs.184504
133599 M64788 Hs.75151



Unigene Title
FBJ murine osteosarcoma viral oncogene homolog B
Immunoglobulin kappa variable 1D-8
glyceraldehyde-3-phosphate dehydrogenase
immediate early protein
POU domain; class 2; transcription factor 2
Microtubule-Associated Protein Tau, Alt Spliced, Exon 8
immunoglobulin lambda-like polypeptide 3
CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated)
ubiquitin C
Homo sapiens mRNA; cDNA DKFZp586L1722 (from clone DKFZp586L1722)
major histocompatibility complex; class II; DM beta
CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated)
Spliceosomal Protein Sap 62
syntaxin 3A
ESTs
ESTs
ESTs
ESTs
thymosin; beta 4; X chromosome
ESTs
ferritin; light polypeptide
mitogen-activated protein kinase kinase 7
ESTs
Human HLA-DR alpha-chain mRNA
caldecrin (serum calcium decreasing factor; elastase IV)
cathepsin D (lysosomal aspartyl protease)
Immunoglobulin lambda gene cluster
Alpha-1-Antitrypsin, 5' End
ESTs
ESTs
ESTs; Weakly similar to stac [H.sapiens]
U6 snRNA-associated Sm-like protein
ESTs
v-ets avian erythroblastosis virus E26 oncogene related
Major Histocompatibility Complex, Class II Beta W52
ATP synthase; H+ transprtng; mitochndrl F1 complex; beta polypept
transforming growth factor beta-stimulated protein TSC-22
ESTs; Moderately similar to F25965_3 [H.sapiens]
v-raf murine sarcoma 3611 viral oncogene homolog 1
Serine/Threonine Kinase (Gb225424)
matrix metalloproteinase 2 (gelatinase A; 72 kD gelatinase; 72kD type IV collagenase)
ESTs; Weakly similar to similar to SP:YR40_BACSU [C.elegans]
ESTs
major histocompatibility complex; class II; DQ beta 1
Human ataxia-telangiectasia locus protein (ATM) gene, exons 1a, 1b, 2, 3 and 4, partial cds



R1 

0.012

0.015 

0.017

0.017 

0.019 

0.02 

0.021 

0.022 
0.022 

0.023 
0.023 

0.023
0.024
0.027 
0.027 
0.028



0.029 

0.029 

0.03 

0.03 

0.03 

0.032 

0.032 

0.032 

0.033 

0.034 

osm 

0.038 



0.037 
0.037 
0.037 
0.038 

osm 

0.039 
0.039 
0.039

0.039



RAP1; GTPase activating protein 1
0.04

0.04 
0.04 
0.041

02104 
31340 
30446 
01352 
22593 
30181 
34071 

108129 

30511 

33336 



31880 
30540 
33467 
01191 
01860 
(2799 

07200 
01166 



24950 

02919 
00574 



102675

31332 
01634 
13118 
24884 
30523 
10244 
31932 



33372 
00817 
06746 
135401 
30479 
02589 
21521 
35340 



101278 



00564 
133132 
21811 
29613 
32468 
20111 



130386 
04275 
06305 
16431 



U12139 

AA478305

X79510 

L77701 

AA453310 

R39552 

214093 

AA053252 

L32137 

AA291456 

L02326 

AA047034 

U35234 

AA258595 

L20688 

M95610 



14427 
118821 
18979 
07495 
20240 



D20350 

114927 

M54915 

AA436026

T03786 

X12447 

HG2279-HT2375 

AA450092 

U72512 

R50487 

M57731 

T47906 

R77276 

W76097 

H26742 

AA454980 

H09751 

AA291139 

HG4011-HT4804

AM76438 

L14813 

R44163 

U62015 

AA412165 

AA425137 

AA342422 

AA282133 

L38487 

X80200 

HG2239-HT2324 

Z40883 

AA424535 

AA279481 

S79854 

W95841 

Z83741 

F10874 

CQ2170 

AA436146 

AA609878 

AA206465 

AA017063 

N79070 

N93798 

W78776 

Z41732 



Hs£5817 

Hs.155693 

Hs.16297 

Hs.128749 

Hs.151608 

Hs.78950 

Hs.185848 

Hs.1584 

Hs.71190 

Hs.198116

Hs.33818

Hs.159534

Hs.73931

Hs.83656 

Hs.37165 



Hs.5628

Hs.2099

Hs.81170

Hs.98858 

Hs.15t53t 

Hs.183760 

Hs.25300



HSJ25717 

Hs.75765 

Hs.220512

Hs.120911

Hs.214507



Hs.25601
Hs.5038
Hs.72242 

Hs.7991 

Hs.169271 

Hs.12457 

HsJ867 

Hs.97358 

Hs.99093 

Hs.45073 

Hs.88960

Hs.110849 

Hs.6375 



Hs.98416

Hs.238831

Hs.49322 

Hs.136031 

Hs.248174

Hs.234249

Hs.39387 

Hs.12828 

Hs.55289 

Hs^56470 

Hs.94789 
Hs.43666
HsJ0375 
Hs.66049 



Human alpha1 (XI) collagen (COL11A1) gene, 5' region and exon 1

Homo sapiens chromosome 19; cosmid R27216

protein tyrosine phosphatase; non-receptor type 21

COX17 (yeast) homolog; cytochrome c oxidase assembly protein

alpha-methylacyl-CoA racemase

Homo sapiens clone 23622 mRNA sequence

branched chain keto acid dehydrogenase E1; alpha polypeptide (maple syrup urine disease)

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]

cartilage oligomeric matrix protein (pseudoachondroplasia; epiphyseal dysplasia 1; multiple)

ESTs

immunoglobulin lambda-like polypeptide 2
RecQ protein-like 5

protein tyrosine phosphatase; receptor type; S
major histocompatibility complex; class II; DQ beta 1
Rho GDP dissociation inhibitor (GDI) beta
collagen; type IX; alpha 2

Human endogenous retroviral H protease/integrase-derived ORF1 mRNA, complete cds, and putative envelope prot mRNA, partial cds
ESTs

lipocalin 1 (protein migrating faster than albumin; tear prealbumin)

pim-1 oncogene

ESTs

protein phosphatase 3 (formerly 2B); catalytic subunit; beta isoform (calcineurin A beta)



0.041 
0.041 
0.042 
0.042
0.042
0.042 

0.042 

0.043 

0.043 
0.043 
0.044 

om 
om 
om 
om 
om 

om 
om 



Triphosphate Isomerase 
Homo sapiens clones 24718 and 24825 mRNA sequence 
Human B-cell receptor associated protein (hBAP) alternatively spliced mRNA, partial 3' UTR
ESTs

GRO2 oncogene
ESTs
ESTs
ESTs

ESTs; Weakly similar to ALR [H.sapiens]
chromodomain helicase DNA binding protein 3
neuropathy target esterase
ESTs

Dystrophin-Associated Glycoprotein, 50 Kda, Alt. Splice 2
ESTs

carboxyl ester lipase-like (bile salt-stimulated lipase-like)
Homo sapiens clone 23770 mRNA sequence
cysteine-rich; angiogenic inducer; 61
EST

Homo sapiens chromosome 19; cosmid R28379
ESTs

ESTs; Weakly similar to similar to collagen [C.elegans]

estrogen-related receptor alpha

TNF receptor-associated factor 4

Potassium Channel Protein (Gb:Z11585)

ESTs; Weakly similar to dJ393P122 [H.sapiens]

ESTs

ESTs; Weakly similar to collagen alpha 1 (XVIII) chain [M.musculus]

deiodinase; iodothyronine; type Oi

ESTs

H2A histone family; member M

mitogen-activated protein kinase 8 interacting protein 1

ESTs; Weakly smlr to weak smlrity to ribosomal prot L14 [C.elegans]

ESTs

ESTs; Weakly smlr to 110 KD CELL MEMBRANE GLYCOPROTEIN [H.sapiens]
EST

ESTs; Highly similar to Miz-1 protein [H.sapiens]
ESTs

protein tyrosine phosphatase type IVA; member 3

ESTs 

ESTs 



0.044 
0.044 

0.044 
0.044 
0.045 
0.045 

0.045 
0.045 
0.046 

om 

0.046 

0.046

0.046 

0.048 

0.046 

0.046 

0.047

0.047

0.047 

0.047

0.047 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.048 

0.049 

0.049 

0.049 

0.049 

0.049 

0.049 

0X6 

0.813 

0.05 

0.05 

om 
om 

0.051
0.051
114331


Z41309 


Hs.12400




130947 


R40037 


Hs.21508




129242 


W81679 


HSJ5174 




131413 


AA462390 


Hs.26510




112304 


R54798 


Hs.26239 




101416 


M17254 


Hs.45514 




131201 


AA426304 


Hs.24174 




101054 


K02405


Hs.73933




101306 


L41143 


Hs.232069




129311 


T55087 






129942 


U95301 


Hs.144442 




119210 


R93340 


Hs.92995




101046 


K01160




114086 


Z38266 


Hs.12770 




110171 


H19964 


Hs.31709




101004 


J04101 


Hs.248109




129715 


N58479 


Hs.12126 




101581 


M34996 


Hs.198253




113285 


T66830 


Hs.182712 




127537 


AA569531 


Hs.162859 




100813 


HG3995-HT4265 






101841 


M93107 


Hs.76893 




135053 


R77159 


Hs.93678


101419 


M17886 


Hs.177592




119724 


W69468 


Hs.47622 




102673 


U72509 






129877 


AA248589 


Hs.13094 




114788 


AA156737 


Hs.103904 


123812 


AA620607 


Hs.111591




117669 


N39237 


Hs.44977 




123782 


AA610111 


Hs.162695 




102395 


U41767 


Hs.92208 




133795 


M12529 


Hs.169401 


10Q1QO 
1^0 190 


/WtWttO 


Hs.138956


132595 


AA253369 


Hs.155742 




104161 


AA456471 


Hs.7724 




115330 


AA281145 


Hs.88827 




112893 


T08000 


Hs.194684 




133475 


L29217 


Hs.73987 




128699 


K03207 


Hs.103972




102940 


X13958 


Hs£4998 




131299 


AA431464 


Hs.25426




102495 


U51240 


Hs.79356 


129594 


R70379 


Hs.115396




110503 

1 10390 




Hs.207689




126702 


U54602


Hs.2785




124386 


N27368 


Hs.212414




130538 


M20786 


Hs.159509 




114299 


Z40782 


Hs.22920




115604 


AA400378 


Hs.49391 




106052 


AA416947 


Hs.6382 




131730 


U05681 


Hs.31210


131285 


AA479498 


Hs.25274




129705 


X78706 


Hs.12068 




123175 


AA489010 


Hs.178400 




103592 


Z30644 


Hs.123059 




116198 


N59478 


Hs.48396 




104886 


AA053348 


Hs.144626 




104250 


AF000575 


Hs.105928 




113301 


T67452 


Hs.13104 


110441 


H50302 


Hs.19845


125297 


Z39215 


Hs.159409 




135258 


AA292423 


Hs.47272




130633 


T92363 


Hs.178703




112006 


R42607 


Hs.22241



ESTs 
ESTs 

ribosomal protein S17

ESTs; Modly smlr to vacuolar prot sorting homolog r-vps33b [R.norvegicus]
ESTs

v-ets avian erythroblastosis virus E26 oncogene related
ESTs

Human MHC class II HLA-DQ-beta mRNA (DR7 DQw2); complete cds

T-cell leukemia translocation altered gene

yb45c08j1 Stratagene fetal spleen (#937205) Homo sapiens cDNA clone IMAGE:74125 5', mRNA sequence.

phospholipase A2; group X

ESTs

Accession not listed in Genbank

Homo sapiens PAC clone DJ0777O23 from 7p14-p15

ESTs

v-ets avian erythroblastosis virus E26 oncogene homolog 1

ESTs; Weakly similar to LR8 [H.sapiens]

major histocompatibility complex; class II; DQ alpha 1

ESTs

ESTs

Cpg-Enriched Dna, Clone S19

3-hydroxybutyrate dehydrogenase (heart; mitochondrial)



ribosomal protein; large; P1
ESTs

Human alternatively spliced B8 (B7) mRNA, partial sequence

ESTs; Weakly similar to ORF YGR101W [S.cerevisiae]

EST

ESTs

ESTs

EST

a disintegrin and metalloproteinase domain 15 (metargidin)

apolipoprotein E

ESTs

glyoxylate reductase/hydroxypyruvate reductase
KIAA0963 protein



bassoon (presynaptic cytomatrix protein)
CDC-like kinase 3

proline-rich protein BstNI subfamily 4

Hu 12S RNA induced by poly(rI); poly(rC) and Newcastle disease virus
ESTs; Weakly similar to unknown [H.sapiens]
Lysosomal-associated multispanning membrane protein-5
Human germline IgD chain gene; C-region; C-delta-1 domain
EST

keratin 17

sema domain; immunoglobulin domain (Ig); short basic domain; secreted; (semaphorin) 3E

alpha-2-plasmin inhibitor

similar to S68401 (cattle) glucose induced gene

ESTs

ESTs; Highly similar to KIAA0612 protein [H.sapiens]
B-cell CLL/lymphoma 3

ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens]

carnitine acetyltransferase

ESTs

chloride channel Kb

ESTs; Moderately similar to tumor necrosis factor-alpha-induced protein B12 [H.sapiens]
growth differentiation factor 11

leukocyte immunoglobulin-like receptor; subfamily B (with TM and ITIM domains); member 3

EST

ESTs; Highly smlr to prot phosphatase 2A BR gamma subunit [H.sapiens]
ESTs

ESTs; Weakly similar to dJ281H8.2 [H.sapiens]
ESTs 

hypothetical protein 




0.051
0.052
0.052
0.052
0.052
0.052
0.052
0.052
0.053

0.053
0.053
0.053
0.053
0.053
0.053
0.053
0.053
0.053
0.053
0.054
0.054
0.054
0.054
0.054
0.055
0.055
0.055
0.055
0.055
0.055
0.055
0.055
0.055
0.056
0.056
0.056
0.056
0.056
0.056
0.056
0.056
0.057
0.057
0.057
0.057
0.057

0.057
0.057
0.057
0.057
0.057
0.057
0.058
0.058
0.058
0.058

0.058



0.058
0.058
0.058
0.058
0.058
0.058
0.058
134907
32619 
35115 
00531 
24530 



U12194 
D80002
AA404565 



132793 
01076 
30655 
134458 
105904 



121628 
133418 
29317 
30153 
24403 
27663 
29814 
31770 
17557 
,03522 



102135 
123617 
12136 
133725 
102069 



123269 
109088 



29375 
135271 



129384 
123427 
105236 
101012 
134791 
133700 
123887 
129363 
105719 
24226 
17437 

132741 
134437 
107664 
120844 
01574 
31219 
03495 
29607 
06467 
28841 
00515 
18332 
34516 
35012 
03575 
15514 

03996 
10505 

133912 



HG1872-HT1907 

N62256 

W87533 

AA478999 

104270 

N92934 

AA192614 

AA401452 

AA026793 

AA425166 

U76366 

N46244 

085815 

N31745 

AA668123 

W20070 

D59682 

N33920 

Y10514 

W91960 

U15460 

AA609183 

R46100 

V00563 

U09196 

AA455000 

AA491226 

M166837 

AA263028

W79850 

AA397763 

W90398 

AM77106 

AA598548 

AA219179 



L18983 

K01396 

AA621065 

H05704 

AA291644 



N27645 

AA394133 

M26041 

AA010594 

AA349417 

M34182 

C00476 

Y09022 

AA404594 

AA450040 

T16358 

HG1723-HT1729 
T54095 
AA171939 
X73608 



AA297739 

AA321355 
H55992 
X62744 
M33600 



Hs.170238 
Hs.178292
Hs.53447
Hs.94653

Hs.102727 

Hs.32699

Hs.56966

Hs.1116 

Hs.17409 

Hs.83577 

Hs.32060



Hs.98497

Hs.172727 

Hs.110373

Hs.15114 

Hs.102493 

Hs.134170 

Hs.168625

Hs.31833

Hs.44532 

Hs.250640

Hs.41691 

Hs.181131 

Hs.9739 

Hs.179543

Hs.82520

Hs.16725 

Hs.105280 

Hs.72620 

Hs.111076 

Hs.11081 

H&07562 

Hs.6147 

Hs.110757 

Hs.112471 

Hs.19105

HsX97 

HsX9655 

Hs.75621 

Hs.112943

Hs.110746 

HSJ38793 

Hs.190266



Hs.198253

H&S326 

Hs.96917

Hs.158029 

Hs.24395

Hs.153591 

Hs.11607 

Hs.154162 

Hs.106443 



Hs.23413
Hs.93029

H$j55609 



H^20495 
Hs.77522 
Hs.180255 



sodium channel; voltage-gated; type I; beta polypeptide 0.058

KIAA0180 protein 0.058

ESTs; Moderately similar to kinesin light chain 1 [M.musculus] 0.058

neurochondrin 0.058

Major Histocompatibility Complex, Dg 0.058

EST 0.058

ESTs; Moderately similar to UV-1 protein [H.sapiens] 0.058

KIAA0906 protein 0.058

lymphotoxin beta receptor (TNFR superfamily; member 3) 0.058

cysteine-rich protein 1 (intestinal) 0.058

cysteine and glycine-rich protein 3 (cardiac LIM protein) 0.058

ESTs 0.059

ESTs; Weakly similar to 4F2/CD98 light chain [M.musculus] 0.059

ESTs 0.059

Treacher Collins-Franceschetti syndrome 1 0.059

ESTs 0.059

ras homolog gene family; member D 0.059

ESTs 0.059

ESTs 0.059

KIAA0979 protein 0.059

ESTs 0.06

diubiquitin 0.06

H.sapiens mRNA for CD152 protein 0.06

sequence-specific single-stranded-DNA-binding protein 0.06

activating transcription factor B 0.06

ESTs 0.06

ESTs 0.061

immunoglobulin mu 0.061

Hu 1.1 kb mRNA upregltd in retinoic acid treated HL-60 neutrophilic cells 0.061

ESTs 0.061

ESTs; Weakly similar to (U963K232 [H.sapiens] 0.061

DKFZP43411 14 protein 0.061

malate dehydrogenase 2; NAD (mitochondrial) 0.061

ESTs; Weakly similar to HPBRII-7 protein [H.sapiens] 0.061

ESTs 0.061

KIAA1075 protein 0.061

DNA segment on chromosome 21 (unique) 2056 expressed sequence 0.061

ESTs 0.061

translocase of inner mitochondrial membrane 17 (yeast) homolog B 0.061

cytochrome c-1 0.062

protein tyrosine phosphatase; receptor type; N 0.062

protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 0.062

ESTs 0.062

H sapiens HCR (a-helix coiled-coil rod homologue) mRNA; complete cds 0.062

ESTs 0.062

ESTs 0.062
ywSe&sl Weizmann Olfactory Epithelium H sapiens cDNA clone IMAGE255676 3' smlr to contains L1.t3L1 repetitive element;, mRNA seq 0.062

ESTs; Highly similar to OASIS protein [M.musculus] 0.062

major histocompatibility complex; class II; DQ alpha 1 0.062

ESTs; Moderately similar to pim-1 protein [H.sapiens] 0.062

ESTs 0.062

protein kinase; cAMP-dependent; catalytic; gamma 0.062

small inducible cytokine subfamily B (Cys-X-Cys); member 14 (BRAK) 0.062

Not56 (D. melanogaster)-like protein 0.062

ESTs 0.062

ADP-ribosylation factor-like 2 0.062

ESTs 0.062

Macrophage Scavenger Receptor, Alt. Splice 2 0.062
ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 0.062

ESTs 0.062

sparc/osteonectin; cwcv and kazal-like domains proteoglycan (testican) 0.063
H sapiens isoform 1 gene for L-type calcium channel, exon 1
ESTs; Weakly similar to ISOLEUCYL-TRNA SYNTHETASE; CYTOPLASMIC [H.sapiens] 0.063

EST2393 Bone marrow Homo sapiens cDNA 5' end, mRNA sequence 0.063

DKFZP434F011 protein 0.063

major histocompatibility complex; class II; DM alpha 0.063

major histocompatibility complex; class II; DR beta 1 0.063
130139
105817 
134658 
100306 
100277 

133116 
134809 
130319 
132057 
108334 

129763 
135112 
122269 
133082 
113213 

106226 

130192 

104894 

103508 

128474 

134012 

134536 

111714 

110521 

103282 

113921 

129331 

111316 

135138 

107289 

121405 

124965 

106595 

100106 

134715 

135367 

111533 

128509 

101030 

102753 

126991 

109583 

119241 

130569 

112928 

120495 

130931 



110697 
121183 
130953 
102218 
114181 
116581 
132498 
103788 
102459 

100373 
132717 
128863 
115193 
124558 
117225 
110665 



AA397825 
AA410617 



D42053 

D61259 

AA521488

X74794 

AA102489 

AA070473 

F10815 

T67464 

AM36856 

AA457129 

T58607

AA429290 

Y12661 

AA054087 

Y10141 

U40671 

AA417821 

AA457735 

R23146 

H57060 

X80198

W80730 

N93465 

N74597 

AA036794 

T10792 

AA406083 

T16275 

AA456933 

AF015910 

AA282757 

AA480109 

R08548 

R53109 

J05037 

U80226 

R31652 

F02322 

T12559 

AA156597 

T10316 

AA256073 

AA278412 

M87789 

H03387 

H93721 

AA400138 

U12707 

U24163 

238079 

D51287 

T87708 

AA096014 

U48936 

D79999 

AA203321 

087462 

AA262029

N66048 

N20392 



Hs.150922 

Hs.5307

Hs.178009 

Hs.80598

Hs.75890 



HsXQ998 

Hs.154443 

Hs.173484 



Hs.12373 
HsX4617 
HsX8910 
Hs.6455



Hs.17719 

Hs.171014 

Hs.18858

Hs.100299 

Hs.237924

HaXSO 

Ha£3466 

Hs.108268 

Hs.77628 

Hs.28355

Hs.110453

Hs.180535 

HsX5196 

Hs.172098

Hs.88007 

Hs.106359 

Hs.174481 

Hs*9040 

Hs.9963

Hs£51651 

Hs.247362

Hs.76751 

Hs.821

Hs.26135

Hs.221382

Hs.256441

Hs.4302

Hs.190626 

Hs.21346

Hs.140 

Hs.241305

Hs.20798

Hs.97703 

Hs.2157

Hs.75160 

Hs.8021

Hs.82148

Hs.50098

Hs.9527



Hs.77225 

Hs.151696 

Hs.106674 

Hs.88218

Hs.141605 

Hs.42848 

Hs.32757 



BCS1 (yeast homolog)-like
synaptopodin
ESTs

transcription elongation factor A (SII); 2
site-1 protease (subtilisin-like; sterol-regulated; cleaves sterol regulatory element binding proteins)
ESTs

KIAA0128 protein

minichromosome maintenance deficient (S. cerevisiae) 4
ESTs

zm7c8.s1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone IMAGE2399 3', mRNA sequence
KIAA0422 protein

ESTs; Weakly similar to predicted using Genefinder [C.elegans]
ESTs

RuvB (E. coli homolog)-like 2

ya94a02.s1 Stratagene placenta (#937225) Homo sapiens cDNA clone IMAGE69290 3', mRNA sequence.

ESTs

VGF nerve growth factor inducible

phospholipase A2; group IVC (cytosolic; calcium-independent)

H.sapiens DAT1 gene, partial, VNTR



ESTs; Highly similar to CGI-69 protein [H.sapiens]
IMP (inosine monophosphate) dehydrogenase 1
ESTs
ESTs

steroidogenic acute regulatory protein related
ESTs

ESTs; Highly similar to CGI-38 protein [H.sapiens]
ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens]
ESTs; Weakly similar to T20B123 [C.elegans]
ESTs
ESTs
ESTs
ESTs

Homo sapiens unknown protein mRNA, partial cds
prepronociceptin

TYRO protein tyrosine kinase binding protein
EST

dimethylarginine dimethylaminohydrolase 2
serine dehydratase
Human gs
biglycan
ESTs
ESTs

EST; Moderately similar to CGM36 protein [H.sapiens]

ESTs

ESTs

ESTs; Weakly similar to F42C5.7 gene product [C.elegans]

immunoglobulin gamma 3 (Gm marker)

estrogen-responsive B box protein

ESTs

ESTs

Wiskott-Aldrich syndrome (eczema-thrombocytopenia)
phosphofructokinase; muscle
KIAA1058 protein
ribosomal protein S12

ESTs

ESTs; Highly similar to HSPC013 [H.sapiens]

Human amiloride-sensitive epithelial sodium channel gamma subunit mRNA, 5' end, partial cds

ADP-ribosyltransferase (NAD+; poly (ADP-ribose) polymerase)-like 1

DKFZP727G051 protein

BRCA1 associated protein-1 (ubiquitin carboxy-terminal hydrolase)

ESTs 

ESTs 

ESTs 

ESTs
0.064
0.064 
0.064 
0.064 

0.064 
0.064 
0.064 
0.064 
0.064 

0.064 
0.064
0.064
0.064
0.064 

0.065 
0.065 
0.065 
0.065 
0.065 
0.065
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065 
0.065
0.065 
0.066
0.068 
0.066 
0.066 
0.068 
0.068 
0.066 
0.067 
0.067 
0.067
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067 
0.067
0.067
0.067
0.067
0.067 
0.068
0.068

0.068 
0.068
0.068
0.068
0.068
0.069
0.069
132905


U70663 


Hs.182965 


105778 


AA348910 


Hs.153299 


134770 


R72079 


Hs£9575 


123097 


AA485869 


Hs.105671 


100750 


HG3523-HT4899 




125091 


T91518 




100756 


HG3565-HT3768 




113463 


T87768 


Hs.16439 


101119 


109708 


Hs.5253


102286 


U31628 


Hs.12503 


135349 


D83174 


Hs.9930


100991 




HsJ2085 


133675 


AA443720 


Hs.7551 


105422 


AA251014 


Hs.12210 


102932 


X13334 


Hs.75627 


119147 


R58878


Hs.65739 


104900 


AA055048 


Hs.180481 


133185 


AA481404 


Hs.6686


115498 


AA290674 


Hs.71819 


121005 


AA598332


Hs.47613


124869 


R69088 


Hs£8728 


129154 


N23673 


Hs.108969 


112181 


R48295 




128251 


W87488 


Hs.141464 


134298 


J00116 


Hs.81343 


119745 


W70264 


Hs.58093


131306 


AA232688 


Hs.25489 


107776 


AA018820 


Hs.221147 


134271 


AA199630 


Hs.184456 


101798 


M85220 




135402 


S76942 


Hs.99922 


118742 


N74052 


HsJ0424 


131867 


N64656 


Hs.3353


102923 


X12517 


Hs.1063 


100775 


HG371-HT26388 




111020 


N54361 


Hs.185726


134224 


X80822 


Hs.163593 


124059 


F13673 


Hs.99769 


133972 


AA160743 


Hs.78019


129681 


AA436009 


Hs.178186 


103065 


X58399 


Hs.81221 


124966 


T19271 


Hs.155560 


112270 


R53021 


HSJ203358 


116704 


F10183 


Hs.66140 


129890 


M13699 


Hs.111461 


127345 


AA972008 


Hs.166253 


112436 


R63090 


Hs.28391


114531 


AA053033 


Hs.203330


135122 


H99080 


Hs£4814 


103934 


AA281338 


Hs.134200


109363 


AA215369 


Hs.185784 


112647 


R83329 


Hs.33403


127083 


Z44079 


Hs.91608 


133027 


AA402624 


Hs.63238


122086 


AA432121 


Hs.250986


110405 


H47542 


Hs.33962


128697 


AB002344 


Hs.103915 


112221 


R50380 


Hs.25670


100478 


HG1067-HT1067 




115598 


AA400129 


Hs.65735 


132491 


AA227137 


Hs.4984


101655 


M60299 




106018 


AA411687 


Hs.34737 


129683 


W05348 


Hs.158196 


134137 


F10045 


Hs.79347 


114008 


W89128 


Hs.19872 



DOM-3 (C. elegans) homolog Z

CD79B antigen (immunoglobulin-associated beta)

ESTs

Proto-Oncogene C-Myc, Alt. Splice 3, Orf 114

ye20f05.s1 Stratagene lung (#937210) H sapiens cDNA clone IMAGE: 3' similar to contains Alu repetitive element;contains MER12 repetitive element; mRNA sequence.

Zinc Finger Protein (Gb:M88357)

ESTs

complement component 2

interleukin 15 receptor; alpha

collagen-binding protein 2 (colligin 2)

plasminogen activator inhibitor; type I

ESTs; Weakly similar to T25G3.1 [C.elegans]

ESTs

CD14 antigen

ESTs

ESTs; Weakly similar to ACROSIN PRECURSOR [H.sapiens]
ESTs

eukaryotic translation initiation factor 4E binding protein 1
ESTs

ESTs; Weakly similar to F55A12.9 [C.elegans]
mannosidase; alpha; class 2B; member 1

ESTs; Wkly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]
ESTs

collagen; type II; alpha 1 (primary osteoarthritis; spondyloepiphyseal dysplasia; congenital)

ESTs

ESTs

ESTs

ESTs; Wkly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens]
Accession not listed in Genbank
dopamine receptor D4
EST

Homo sapiens clone 24940 mRNA sequence
small nuclear ribonucleoprotein polypeptide C
Mucin 1, Epithelial, Alt. Splice 9
ESTs

ribosomal protein L18a
ESTs

Homo sapiens clone 24432 mRNA sequence

ESTs; Weakly similar to WASP-family protein [H.sapiens]

Human 12-9 transcript of unrearranged immunoglobulin V(H)5 pseudogene

calnexin

ESTs

EST

ceruloplasmin (ferroxidase)

ESTs; Highly similar to KIAA0476 protein [H.sapiens]

ESTs

ESTs

ESTs

Homo sapiens mRNA; cDNA DKFZp564C186 (from clone DKFZp564C186)

ESTs; Weakly similar to hypothetical protein [H.sapiens]

ESTs 



0.069 
0.069 
0.069 
0.069 



0.069 



synuclein; gamma (breast cancer-specific protein 1)

EST

ESTs

KIAA0346 protein
ESTs

Mucin (Gb:M22406)
ESTs

KIAA0828 protein

Human alpha-1 collagen type II gene, exons 1, 2 and 3
ESTs

DKFZP434B103 protein
Major Histocompatibility Complex, Class II, Dr Beta 2 (Gb:X65561)
plasminogen activator inhibitor; type I
axin
ESTs

metallothionein 3 (growth inhibitory factor (neurotrophic))
biglycan

complement component 3

ESTs; Wkly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens]
ESTs; Wkly smlr to leucine-rich glioma-inactivated prot precursor [H.sapiens]
activating transcription factor 3
ESTs
EST

solute carrier family 1 (glial high affinity glutamate transporter)
ESTs; Weakly similar to envelope protein [H.sapiens]
plectin 1; intermediate filament binding protein; 500kD
diphtheria toxin resistance protein required for diphthamide biosynthesis (Saccharomyces)-like 1
ESTs; Moderately similar to Pro-a2(XI) [H.sapiens]
ESTs; Wkly smlr to alternatively spliced product using exon 13A [H.sapiens]
ESTs

DKFZP434C212 protein
ESTs

ESTs; Weakly similar to coded for by C. elegans cDNA yk173c12S [C.elegans]
Spliceosomal Protein Sap 49
ESTs

hypothetical protein; expressed in osteoblast
ESTs

FK506-binding protein 1B (12.6 kD)
ESTs
Homo sapiens clone 23700 mRNA sequence
ESTs

immunoglobulin superfamily containing leucine-rich repeat
Human Hox22 gene for a homeobox protein
DKFZP434B0335 protein
ESTs
ESTs
EST
ESTs

pregnancy-associated plasma protein A
T-cell lymphoma invasion and metastasis 2
ESTs; Modly smlr to putative seven pass transmembrane prot [H.sapiens]
blue cone pigment
thymosin; beta 4; X chromosome
receptor-interacting serine-threonine kinase 3
ESTs; Weakly similar to UV-1 protein [H.sapiens]
ESTs; Wkly smlr to !! ALU SUBFAMILY SB1 WARNING ENTRY !! [H.sapiens]
ESTs; Weakly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]
ESTs

Homo sapiens clone 25155 mRNA sequence
ESTs

DKFZP586K0919 protein
ESTs
ESTs
ESTs
ESTs

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]
ESTs

sphingomyelin phosphodiesterase 2; neutral membrane (neutral sphingomyelinase)
128591


AA255537
Hs.25197
X70083


Hs£8414

KIAA0255 gene product 

Homo sapiens clone 643 unknown mRNA; complete sequence
ESTs

interferon; gamma-inducible protein 30
ESTs

KIAA0296 gene product
ESTs

ESTs; Moderately similar to KIAA0544 protein [H.sapiens]

glycine receptor; beta

ESTs

even-skipped homeo box 1 (homolog of Drosophila)

ESTs; Weakly similar to sphingosine kinase [M.musculus]

ESTs

ESTs

EST

EST; Weakly similar to hypothetical protein [H.sapiens]

ESTs

ESTs

ESTs

ESTs; Highly similar to HYPOTHETICAL PROTEIN KIAA0195 [H.sapiens]

protein with polyglutamine repeat

ESTs

Human clone 23548 mRNA sequence

lymphotoxin beta (TNF superfamily; member 3)

ESTs; Weakly similar to predicted using Genefinder [C.elegans]

ESTs; Weakly similar to chondromodulin-I precursor [H.sapiens]

Proline-Rich Protein Prb4, Allele

netrin 2 (chicken)-like

Human DNA from chromosome 19-specific cosmid R30923; genomic sequence

Human liver GABA transport protein mRNA; 3' end

ESTs

KIAA0521 protein
ESTs

KIAA0081 protein

extracellular matrix protein 2; female organ and adipocyte specific
ESTs

za56d02s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE296547 5', mRNA sequence.

ESTs

thyroid hormone responsive SPOT14 (rat) homolog
tenascin R (restrictin; janusin)
ribosomal protein L3-like

solute carrier family 21 (prostaglandin transporter); member 2
ESTs

homeo box 06

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]



KIAA1021 protein
ESTs

ESTs; Weakly similar to DY3X [C.elegans]

EST; Highly similar to CMP-N-acetylneuraminic acid hydroxylase [H.sapiens]

zo8f 12^1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone IMAGE567119 3', mRNA sequence

KIAA1056 protein; JSAP1 homolog (mouse); JIP3 homolog (mouse)

ESTs

interleukin 8

dolichyl-phosphate mannosyltransferase polypeptide 2; regulatory subunit
ESTs; Weakly similar to Miller-Dieker lissencephaly gene [H.sapiens]

ESTs; Weakly similar to O-linked GlcNAc transferase [H.sapiens]
chaperonin containing TCP1; subunit 5 (epsilon)
STIP1 homology and U-Box containing protein 1
filamin C; gamma (actin-binding protein-280)
ESTs 0.081
ESTs 0.081
EST01884 Fetal brain, Stratagene (cat#936206) Homo sapiens cDNA clone HFBCH10, mRNA sequence
gamma-aminobutyric acid (GABA) B receptor; 1
Phospholipid Transfer Protein
ESTs
ESTs

transforming growth factor; beta-induced; 68kD
natural killer cell transcript 4
spastic ataxia of Charlevoix-Saguenay (sacsin)
ESTs
ESTs
ESTs
cadherin 8

ESTs; Weakly similar to PROLINE-RICH PROTEIN MP-3 [M.musculus]
ESTs

hydroxysteroid (17-beta) dehydrogenase 3

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]
ESTs
ESTs

ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus]
Human mRNA for ornithine decarboxylase antizyme; ORF 1 and ORF 2
microtubule-associated protein tau
KIAA0148 gene product

Homo sapiens BAC clone RG118D07 from 7q31
ESTs
ESTs
ESTs

ribosomal protein L18a
amyloid beta (A4) precursor-like protein 1
ESTs; Modly smlr to tumor necrosis factor-alpha-induced prot B12 [H.sapiens]
ESTs
transient receptor potential channel 7
ESTs

Human complement C1q B-chain gene, exon A+1
ESTs

thromboxane A synthase 1 (platelet; cytochrome P450; subfamily V)
EST

zm2d1.s1 Stratagene corneal stroma (#937222) Homo sapiens cDNA clone IMAGES12947 3' similar to TRS103281 E198281 THIOREDOXIN REDUCTASE contains Alu repetitive element;, mRNA sequence
ESTs
ESTs
ESTs

procollagen C-endopeptidase enhancer
pescadillo (zebrafish) homolog 1; containing BRCT domain
ESTs; Weakly similar to GLUCOSE TRANSPORTER TYPE 5; SMALL INTESTINE [H.sapiens]
acyl-Coenzyme A dehydrogenase; long chain
Human mRNA for SB class II histocompatibility antigen alpha-chain
ESTs

Human clone W2-6 mRNA from chromosome X
cullin 1

zm89e5.s1 Stratagene ovarian cancer (#937219) H sapiens cDNA clone IMAGE54512 3' similar to gb:X14723 CLUSTERIN PRECURSOR (HUMAN);, mRNA sequence

ESTs

ESTs

EST

periodontal ligament fibroblast protein
ESTs

transglutaminase 2 (C polypeptide; protein-glutamine-gamma-glutamyltransferase)

ESTs; Moderately similar to alternatively spliced product using exon 13A [H.sapiens]

EST

myosin; heavy polypeptide 8; skeletal muscle; perinatal

ESTs; Weakly similar to ION CHANNEL HOMOLOG RIC PRECURSOR [M.musculus]

ESTs

ESTs

inhibin; beta A (activin A; activin AB alpha polypeptide)

ESTs; Weakly similar to coded for by C. elegans cDNA yk173c1L5 [C.elegans]



ESTs 

Mucin 3, Intestinal (Gb:M55405) 

lymphocyte antigen 6 complex; locus E 

ESTs; Weakly similar to FAST kinase [H.sapiens]

ESTs; Moderately similar to unknown [R.norvegicus]

EST

ESTs

partner of RAC1 (arfaptin 2)
ESTs

ESTs; Weakly similar to ZINC FINGER PROTEIN 135 [H.sapiens]

lethal giant larvae (Drosophila) homolog 1

gastrin-releasing peptide receptor

ESTs

ESTs

GTP binding protein 1

ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens]

EST

ESTs

ESTs

connective tissue growth factor

ESTs; Weakly similar to similar to GTP-binding protein [C.elegans]
aminoacylase 1

Rab geranylgeranyltransferase; alpha subunit
EST

nuclear cap binding protein 1; 80kD
ESTs

murine retrovirus integration site 1 homolog

proMjran

ESTs

ESTs

Human germline IgD chain gene; C-region; C-delta-1 domain
ESTs

yf71a03.ii Soares breast 2NbHBst Homo sapiens cDNA clone IMAGE154166 5' similar to gb:L11284 DUAL SPECIFICITY MITOGEN-ACTIVATED PROTEIN KINASE KINASE 1 (HUMAN);, mRNA sequence.

ESTs; Weakly similar to TERATOCARCINOMA-DERIVED GROWTH FACTOR 1 [H.sapiens]

adducin 1 (alpha)

ESTs

ESTs; Weakly similar to ZFOC1 gene product [H.sapiens]

Human nicotinamide N-methyltransferase gene, exon 1 and 5' flanking region

collagen; type V; alpha 1

spleen focus forming virus (SFFV) proviral integration oncogene spi1

ESTs 

ESTs 



0.084 
0.084 
0.084 



0.084 
0.084 
0.084 
0.084 
0.084 

om 

0.084

om 

0.084 
0.084 

om 
om 

0.084 

om 

0.084 
0.084 
0.085 
0.085 
0.085 
0.085 
0.085 
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.035
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085
0.085 
0.085 
0.086 
0.086 
0.086 



0.086 

0.088 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086 
0.086
0.086 
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05265 
33035 
00768 
129338 
132789 



00721 



30645 
00751 
34550 



01446 
16287 
34034 
30860 



07537 



121288 
08844 
129874 
105139 
24789 
15923 
123640 
31607 
30064 



00109 
104642 
131752 
14727 



06218 
562 
219 
01187 
01513 
16454 
16171 
17500 
19978 
32005 
09914 
30370 
04262 
29708 



120884 
30404 
14072 
31470 
124573 
14717 
33806 
30470 
33182 
16038 



AI205718 

AAQ53248 

AA017356 

U48865 

W73859 

AA227941 

T15965 

HG3636-HT3846 

T56800 

W23781 

AA456309 

HG3355-HT3532 

R73150 

AA020942 

HG3527-HT3721 

M27161 

AA338646 

M213Q2 

AA487656 

X89267 

U6S061 

H04992 

Z20777 

AA488030 
AA085161 

AA401735 

AA132916 

AA406488 

AA164543 

R43803 

AA441929 

AA6092S2 

AA351409 

T67053 

AA127070 

H68077 

AJ000480 

AA004662 

AA453311 

AA132545 



084361 

AA428451 

R09567 

AA400606 

L20316 

M26210 

AA621071 

AA463434 

N31909 

W88S23 

D58231 

H05529 

M55265 



AA417181 

AA447545 

AA365356 

X72012 

Z38184 



N67935 

M131240 

M12759 

AA398552 

Z80787 

AA452572 



Hs.125416 

Ks.185182 

Hs.171900 

Hs.158323 

Hs.78061 

Hs^6088 



HS47Z74 
Hs£6876 
Hs£8831 

Hs.75270 
Hs.17200 

H&85258 

Hs£Q912 

Hs£6306 

Hs.155829 

Hs.78601 

HSL241395 

Hs.30499 

HS.9B57 

Hs.6845 



Hs.97340 

Hs.177961 

Hs.181551 

HS.1 10082 

Hs.76110 

H&38205 

Hs.1 12681 

Hs.172740 

Hs.181125 

Hs.71055 

Hs.1 08211 

Hs.143513 

Hs.184245 

Hs.31568 

Hs.190202 

Hs.179715 

Ks.151123 

HSJ1148 

Hs.1 87569 

H&144344 

H3208 

H&27744 

Hs.42034 

H&42658 

Hs.44278 

HS59190 

Hs.173091 

Hs.1 94704 

Hs.155140 

Hs.1 05941 

Hs.1 20858 

Hs.18268 

H&97041 

Hs.76753 



H&2722 

Hs.194703 

H&252014 

HS.7632S 

Hs.15711 

Hs240135 

Hs.43866 



ESTs  0.086
ESTs; Highly similar to 40S RIBOSOMAL PROTEIN S10 [H.sapiens]  0.086
armadillo repeat gene deletes in velocardiofacial syndrome  0.085
CCAAT/enhancer binding protein (C/EBP); epsilon  0.086
transcription factor 21  0.086
ESTs  0.086
ESTs  0.085
Myosin, Heavy Polypeptide 9, Non-Muscle  0.086
Homo sapiens mRNA; cDNA DKFZp564B176 (from clone DKFZp564B176)  0.086
ESTs  0.086
regulator of Fas-induced apoptosis  0.088
Peroxisome Proliferator Activated Receptor (Gb230972)  0.087
GTP-binding protein homologous to Saccharomyces cerevisiae SEC4  0.087
STAM-like protein containing SH3 and ITAM domains 2  0.087
Luteinizing Hormone, Beta Subunit  0.087
CD8 antigen; alpha polypeptide (p32)  0.087
adenomatous polyposis coli like  0.087
small proline-rich protein 2A  0.087
KIAA0676 protein  0.087
uroporphyrinogen decarboxylase  0.087
protease; serine; 1 (trypsin 1)  0.087
ESTs  0.087
ESTs; Weakly similar to peroxisomal short-chain alcohol dehydrogenase [H.sapiens]  0.087
ESTs  0.087
zn12c5.s1 Stratagene hNT neuron (#937233) H sapiens cDNA clone IMAGES4728 3' similar to TR:G1151228 G1151228 LPG1P. ;, mRNA sequence  0.087
EST  0.087
Human Chromosome 16 BAC clone CIT987SK-A-388D4  0.087
ESTs  0.087
ESTs  0.088
ESTs; Weakly similar to F17A9.2 [C.elegans]  0.088
ESTs  0.088
ESTs  0.088
microtubule-associated protein; RP/EB family; member 3  0.088
Immunoglobulin lambda gene cluster  0.088
ESTs  0.088
ESTs  0.088
phosphoprotein regulated by mitogenic pathways  0.088
KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog  0.088
ESTs  0.088
ESTs  0.088
ESTs  0.088
Human mRNA for p52 and p64 isoforms of N-Shc; complete cds  0.088
DKFZP586E0820 protein  0.088
ESTs  0.088
EST  0.088
glucagon receptor  0.088
RAB3A; member RAS oncogene family  0.088
ESTs; Moderately similar to T-complex protein 10A [H.sapiens]  0.088
ESTs  0.089
ESTs  0.089
EST  0.089
DKFZP434K151 protein  0.089
leucine-rich; glioma inactivated 1  0.089
casein kinase 2; alpha 1 polypeptide  0.089
bagpipe homeobox (Drosophila) homolog 1  0.089
ESTs  0.089
adenylate kinase 5  0.089
ESTs  0.089
endoglin (Osler-Rendu-Weber syndrome 1)  0.089
ESTs  0.089
inositol 1,4,5-trisphosphate 3-kinase A  0.089
adaptor-related protein complex 4; mu 1 subunit  0.089
EST  0.089
Human Ig J chain gene  0.09
KIAA0639 protein  0.09
H4 histone family; member J  0.09
ESTs  0.09
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132404 


AA393903 


Hs.4768 




122695 


AA456048 


H&S9403 




125975 


AA495891 


Hs.152290 




110783 


N23669 


HS26407 


5 


128860 


AA410343 


Hs.129826 




120740 


AA302650 


Hs.96654 




119564 


W38206 






134474 


AAD54746 


HsJ379 


10 


118014 


N95435 


HS55144 


1(8791 


F10669 


Hs.13228 




117605 


N35073 


Ha.44433 




121589 


AM16627 


Hs.191598 




104326 


D81655 


Hs.143067 


15 


129861 


N69507 


Hs.129849 


102795 


U88667 


Hs.188396 




119628 


W49499 


Hs.184456 




110516 


K56894 


Hs37368 




105382 


AA236853 


Ks.111801 


20 


123754 


AA609964 


Hs.102021 


108006 


AA039430 


HsX1820 




121057 


AA398619 


Hs.142375 




123675 


AA609474 


Hs.1 12713 




135194 


C20975 


H&9813 


25 


127070 


AA641812 


Hs.190037 


134051 


S67070 


Hs.78846 




133382 


AA1 12532 


Hs.7247 




103615 


Z46967 


Hs.115460 




118457 


N66593 


Hs.49230 


30 


118504 


N67334 


H&50158 


112915 


T10176 


Hs.4254 




132088 


AA470121 


H&2439S0 




101504 


M27288 


H&248156 




112550 


R71391 


H&29074 


35 


128551 


H09058 


H&237323 


112879 


T03541 


Hs.115960 




127079 


AI364691 


Hs.128628 




101993 


U01062 


H&.77515 




113020 


T23830 


Hs.7303 


40 


120465 


AA251505 


Hs.130861 




Ud£0<K> 


H« 151 1<LQ 
no. 131 139 




104941 


AA065169 


Hs.17805 




110090 


H16076 


Hs.6915 




135375 


AA480888 


HsX9741 


45 


123769 


AA620418 


Hs.1 12861 


118968 


N93438 


Hs.78907 




116969 


K80633 


HS.143038 




125147 


W38150 






100838 


HG411S-HT4383 




50 


114726 


AA132509 


Hs.103827 


107311 


T57738 


H&174112 




112863 


T03148 


Ks.4610 




129290 


AA521407 


Hs.110095 




103384 


X92762 


H&79021 


55 


112508 


R68213 


HS28847 




111863 


R37495 


H&23578 




131184 


AA452705 


H&23954 




107420 


W26567 


Hs.4775 


60 


111768 


R27606 


K&24185 


112290 


R53940 


HS26016 




130581 


AA481982 


Ks.16258 




120744 


AA302772 


H&228649 




112226 


R50761 


H&25738 


65 


116154 




HS57100 


102640 


U67674 


HS.194783 




129797 


X53595 


Hs.1252 




102705 


U77180 


H&50002 




132408 


AA035547 


H&47822 




108441 


AA079079 





ESTs Oj09 

ESTs; Moderately sinfiar to unduBn 2 [Rsapians] Oj09 

ESTs; Highly simflar to PACAP type-3MP type-2 receptor [Rsaptens] Oj09 

ESTs 0X9 

to traspan transmembrane 4 super family 0.09 

EST 0X9 

Accession not feted In Genbank 00)9 

ESTs 0X9 

ESTs 0.09 

DRE-antagonist modulator; caJsenffin 0X9 

ESTs 0j09 

ESTs 0X9 

ESTs 0X9 

DKFZP564M182 protein 0X9 

ATP-binding cassette; sub-family A (ABC1); member 4 0X8 

ESTs; WWysmlr toll ALU SUBFAMILY SX WARNING ENTRY 0 [Usaptens] 0j09 

EST 0X9 
Homo sapiens mRNA; cDNA DKFZp564H2023 (from clone DKFZp564H2023) 0.09 

ESTs 0.09 

ESTs 0.09 

ESTs; Moderately similar to putafive envelope protein [H^apiens] 0.091 

EST _ 0X91 

ESTs; Highly Mar to angtopoistin-related proSln [H .sapiens] 0.091 

ESTs 0.091 

heat shock 27kD protein 2 0.091 

ESTs 0J»1 

caBctn 0X91 

EST 0.091 

ESTs 0.091 

ESTs 0X91 

HLA-B associated transcript-3 0.091 

oncostafinM 0.091 

ESTs 0.091 

N-acetylglucx)samlne-phosphate rmrtase; DKFZP434B187 protein 0.091 

ESTs 0.091 

ESTs; Moderately similar to CL3BC [Rjiorvegicus] 0.091 

Inositol 1 ^^-triphosphate receptor, type 3 0.091 

ESTs; Weakly similar to PROHIBITUM [H.sapiens] 0.091 

ESTs 0.091 

E74-Cke factor 4 (ete domain transcription factor) 0.091 

ESTs 0.091 

ESTs 0.091 

ESTs; WeaJdy simflar to BRAIN PROTEIN H5 [H^apiens] 0.091 

ESTs 0.092 

ESTs; Highly dmBar to HSPC0Q2 [tisaplens] 0.092 

ESTs 0.092 

Accession not Ested m Genbank 0.092 

Olfactory Receptor Or17-201 0.092 

EST 0.092 

ESTs 0.092 

EST 0.092 

ESTs 0.092 
tafazzin (cardiomyopathy; dilated 3A (X-inked); endocardial 

fibroelastosis 2; Barm syndrome) 0.092 

ESTs 0.092 

ESTs 0.092 

ESTs; WeaWy similar to KIAA0584 protein [H^apiens] 0.092 

ESTs 0X92 

ESTs 0.092 

ESTs 0.092 

ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-5A [H^apiens] 0.092 

EST 0.093 

ESTs 0.093 

ESTs 0X93 

solute canter family 10 (sodium/bile add ootranspoiter family); member 2 0X93 

apoIlpoproteJn H (beta-2-gtycoproteIn I) 0X93 

small tnductbte cytokine subfamily A (Cys-Cys); member 19 0.093 

K1AA0380 gene product; RhoA-specific guanine nucleotide exchange factor 0X93 
zm97c9.s1 Stratagene colon HT29 (1937221) Homo sapiens cDNA done 
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108145 
106466 
101687 
121284 

117824 
115771 
102303 
131405 
112909 
124173 
112488 
130554 
106413 
111711 
117595 
113813 
107769 

114966 

130297 
109589 
112592 
102314 
116128 
106809 



130607 
120592 
117230' 
105948 
101333 
101909 
106959 
127034 
134430 
120342 
104450 
130902 
102708 
107373 
123569 
102687 



100283 
102747 
107798 



116010 
117155 
133094 
113174 
102016 
130126 
134813 
132055 
122229 
127574 
134432 
128052 
101637 



AA054133 
AA449990 
M64358 
AA401958 

N49065 

AA422049 

U33Q53 

U78255 

T10069 

H41281 



AA447964 

R22891 

N34933 

W45174 

AA018449 

AA250743 

H94949 



R77631 
U34038 
AA459915 
AA479704 



AA043894 

AA281929 

N20535 

AA404597 

L47738 



AA497031 



H52105 

AA207105 

L77564 

AA424530 

U77594 

U85773 

AA608952 

U73379 

AA034951 

D43642 

U79303 

AA019346 

AA608907 



H97536 

AA115572 

T54659 

U03270 

AB002318 

X14767 

N69440 

AA436198 

AA907314 

AA053022 

AA878398 



133079 



X92972 

AA477561 

AA196979 



IMAGE545872 3 1 similar to contains element MER22 MER22 repetitive 

element ;, mRNA sequence 0.093 

Hs.63085 ESTs 0.093 

Hs.76057 lysophosphoBpasen 0.093 

Human rhom-3 gene, exon 0.093 
H&240170 ESTs; Moderately similar to alternatively spliced product using 

exon13A[H.sapiens] 0-093 

Hs.125201 ESTs; Weakly sirriar to B7 [NlmuscuJus] 0.093 

HS40780 ESTs 0.093 

H&2499 protein kinase Oltal 0X93 

H&26468 amyloid beta (A4) precursor protein-binding; famBy A; member 2 (X1 1 -like) 0.093 

Hs.101094 ESTs 0.093 

Hs.107619 ESTs 0.093 

H&28788 ESTs 0X93 

Hs.159637 valyHRNA synthetase 2 0X93 

Hs.6311 ESTs 0.093 

Hs.7093 • ESTs 0.094 

Hs44664 EST 0X94 

H&31382 ESTs 0X94 
Hs.125220 Homo sapiens DNA from chromosome 19K»smkJsR3Q1Q2:R29350-.R27740 

containing MEF2B; genomic sequence 0.094 
H&92198 ESTs; Highly similar to caldum-regulated heat stable protein 

CRHSP-24[hLsaplens] ~~ 0.094 

Hs.171955 trophinlrhasststing protein (tasfin) 0.094 

H&6581 ESTs 0.094 

H&29128 ESTs 0.094 

Hs.154299 coagulafion factor II (thrombin) receptor-like 1 0.094 

Hs.112193 rmitS(Eccflhomofog5 0X94 
H&220324 Human DNA sequence from clone 283E3 on chromosome 1p36.21-36.33. 

Contains the alternatively spliced gene for Matrix Metalloproteinase In the 

Female Reproductive tract MIFR1; -2; MMP21/22A; -B and -C; a novel gene; 

the alternatively spficed CDC2L2 gene for 0.094 

Hs.16603 ESTs 0.094 

Hs.143974 ESTs 0X94 

HJU3265 melastatinl 0X94 

Hs.7133 ESTs 0-094 

Hs.80313 p53 inducible protein 0X94 

Homo sapiens mRNA for PLE21 protein; complete cds 0X94 

Hs.8657 ESTs; Highly similar to CTG7a (H sapiens] 0X94 

ESTs; Wklysmlrto gtucose-6-phosphatase catalyfic subunit [R.norvegicus] 0.095 

Hs.8309 WAA0747 protein 0.095 

H^45068 Homo sapiens mRNA; cDNA DKFZp434i143 (from done DKFZp434l143) 0X95 

Hs.103978 serine/threonine kinase 22B (spermiogenesfe associated) 0X95 

HS21061 ESTs 0X95 

H&37682 retinoic edd receptor responder (tazarotene induced) 2 0X195 

Hs.1 54695 phosphomannomutase 2 0X95 

Hs.195292 ESTs; Wealdy simlar to RNA heilcase HDB/DICE1 [H^apiens] 0.095 

Hs.93002 uKquffin carrier protein E2-C 0.095 

Hs.106893 ESTs 0.095 

H&2430 transcripfcn factor-Bkel 0.095 

H&82482 protein predicted by done 23882 0.095 

Hs.60918 EST 0X95 

Hs.1 12614 EST - 0.095 

HS56421 ESTs; Wealdy similar to Similarity to Kinfluenza ribonudease PH [Celegans] 0X95 

Hs.42391 EST 0.095 

HsX4746 chloride intracellular channel 3 0.095 

Hs.9779 ESTS 0.095 

Hs.1 22511 centrin;EF-hand protein; 1 0.095 

Hs.1 50443 KIAA0320 protein 0.095 

Hs.89768 gamnra-ajnbobutyrlc add (GABA) A receptor; beta 1 0.095 

Hs-38132 ESTs 0.095 

Hs.103902 ESTs 0.096 

Hs.188905 ESTs 0X96 

HsX312 ESTs 0.098 

Hs.190491 ESTs 0.096 

Hs.132834 hematopoietic protein 1 0X96 

HsX0324 protein phosphatase 6; catalytic subunit 0.096 

H&6449 ESTs 0.098 

Hs.104129 ESTs; Wealdy similar to protease [H .sapiens] 0.096 
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107640 


AA009615 


H&257808 


ESTs 


0.096 




123389  AA521176  Hs.21231  ESTs  0.098
103222  X74795  Hs.77171  minichromosome maintenance deficient (S. cerevisiae) 5 (cell division cycle 46)  0.096
111704  R22450  Hs.23396  ESTs; Highly similar to ZINC FINGER PROTEIN 140 [H.sapiens]  0.096
126656  AA306523  EST177475 Jurkat T-cells VI Homo sapiens cDNA 5' end, mRNA sequence.  0.096
127071  AA250806  ESTs  0.096
114550  AA058755  Hs.151714  ESTs  0.096
125955  AI356943  Hs.143761  ESTs  0.098


134363  M37033  Hs.82212  CD53 antigen  0.096
126550  W76492  Hs.170142  ESTs  0.098
122598  AA453465  Hs.99329  ESTs  0.096
116898  N90703  Hs^238  KIAA0478 gene product  0.096
117661  N39092  Hs.44940  ESTs  0.096
120996  AA398281  Hs.143684  ESTs  0.096
123388  AA521172  Hs.134417  ESTs  0.098
106700  AA463929  Hs.28701  ESTs  0.096
112962  T16814  Hs.6828  ESTs  0.096
121262  AA401372  Hs.97723  ESTs  0.096
134551  H44839  Hs.8526  Hbeta-1 ^-N-acetylglucosaminyltransferase  0.096
112060  R43754  Hs.21164  ESTs  0.098
134678  AA039935  Hs.182595  dynein; axonemal; light polypeptide 4  0.098
100855  HG4234-HT4504  Methylenetetrahydrofolate Reductase  0.097
132414  N91193  Hs.48145  ESTs  0.097


112900  T08758  Hs.3813  ESTs  0.097
115989  AA447777  Hs.93135  ESTs  0.097
103561  Z21488  Hs.143434  contactin 1  0.097
131087  AA009738  Hs.22824  ESTs; Weakly similar to p160 myb-binding protein [M.musculus]  0.097
120293  M190859  Hs.191428  ESTs  0.097
111830  R36081  Hs.25085  EST  0.097
113654  T95770  Hs.17668  ESTs  0.097
132675  AA179338  Hs.5476  serine proteinase inhibitor  0.097
120182  Z40125  Hs.91968  ESTs  0.097
132879  U16282  Hs.5881  ELL gene (11-19 lysine-rich leukemia gene)  0.097
134211  AA056681  Hs.80021  ESTs; Weakly similar to 62D9^> [D.melanogaster]  0.097
115448  AA284845  Hs.165051  ESTs  0.097
118118  N56901  Hs.47995  ESTs  0.097
107598  AA004528  Hs.169444  ESTs  0.097
128933  H01824  Hs.760  GATA-binding protein 2  0.097
114892  AA235988  Hs.86024  ESTs  0.097
101922  S75168  Hs.274  megakaryocyte-associated tyrosine kinase  0.097
105444  AA252374  Hs.19333  ESTs; Weakly similar to ATP(GTP)-binding protein [H.sapiens]  0.097
128155  AA926843  Hs.143302  ESTs  0.097
116276  AA485870  Hs.44914  ESTs  0.097


111964  R41227  Hs.21860  ESTs  0.097
135100  AA398926  Hs.251108  Homo sapiens mRNA; chromosome 1 specific transcript KIAA0493  0.097
124872  R69251  Hs.101506  EST  0.097
103084  X59932  Hs.77783  c-src tyrosine kinase  0.097
124138  H23199  Hs.107010  ESTs  0.098
130048  R31745  Hs.211612  SEC24 (S. cerevisiae) related gene family; member A  0.098
100208  D26129  Hs.78224  ribonuclease; RNase A family; 1 (pancreatic)  0.093
123537  AA608775  Hs.112589  ESTs  0.098
118999  N95019  Hs.55092  ESTs  0.098
119847  W60384  Hs.9853  ESTs  0.098
112819  R98618  Hs.35984  ESTs  0.098
131080  J05008  Hs^271  endothelin 1  0.098
127353  AA190853  Hs.155360  ESTs  0.098
132068  X66365  Hs.38481  cyclin-dependent kinase 6  0.098
105744  AA293436  Hs.12909  ESTs  0.098
133680  M92357  Hs.101382  tumor necrosis factor; alpha-induced protein 2  0.098
122899  AA463960  Hs.178420  ESTs; Highly similar to WASP interacting protein [H.sapiens]  0.098
128700  U59266  Hs.103982  small inducible cytokine subfamily B (Cys-X-Cys); member 11  0.098
104393  H46488  Hs.226499  nesca protein  0.098
123320  AM9S792  Hs.139572  EST  0.098
129169  N31641  Hs.109058  ribosomal protein S6 kinase; 90kD; polypeptide 5  0.098
135093  U51333  Hs.159237  hexokinase 3 (white cell)  0.098
113269  T65159  Hs.85044  ESTs  0.098
124283  HB5783  Hs.194138  ESTs; Moderately similar to zinc finger protein RINZF [R.norvegicus]  0.098
114376  GMCSF  Accession not listed in Genbank  0.099
100881  HG4458-HT4727  Immunoglobulin Heavy Chain, Vdjc Regions (Gb:L23563)  0.099
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116572 


D45654 


Hs.65582 


DKFZP586C1324 protein


0.099
123956  AA621747  Hs.112847  EST  0.099
100818  HG4018-HT4288  Opioid-Binding Cell Adhesion Molecule  0.099
132754  W47419  Hs.56007  Human DNA from chromosome 19-specific cosmid F25965; genomic sequence  0.099
112741  R93080  Hs.35035  ESTs  0.099
112748  R93299  Hs.166492  ESTs  0.099
130858  S57235  Hs.246381  CD68 antigen  0.099
124870  R69233  Hs.101504  ESTs  0.099
125304  Z39833  Hs.124940  GTP-binding protein  0.099
121297  AA401995  Hs.97850  ESTs  0.099
128602  AA046103  Hs.102367  ESTs  0.099
124062  H00440  Hs.144524  ESTs; Weakly similar to signal transducer and activator of transcription 2 [M.musculus]  0.099
100547  HG2148-HT2219  Mucin (Gb:M57417)  0.099
105652  AA282505  Hs.19015  ESTs  0.099
133390  AA459945  Hs.72660  KIAA0585 protein  0.099
133503  M33195  Hs.743  Fc fragment of IgE; high affinity I; receptor for; gamma polypeptide  0.099
109461  AA232667  Hs.58210  ESTs  0.099
102068  U09117  Hs.80778  phospholipase C; delta 1  0.099
113464  TB5931  Hs.16295  ESTs  0.099
104240  AB002368  Hs.70500  KIAA0370 protein  0.099


121113  AA399109  Hs.161813  ESTs  0.1
122896  AA469952  Hs.97899  ESTs; Weakly similar to dal2; len:343; CAI: 0.17; ALLC_YEAST P25335 ALLANTOICASE [S.cerevisiae]  0.1
102405  U43148  Hs.159526  patched (Drosophila) homolog  0.1
103599  Z33905  HsX1218  receptor-associated protein of the synapse; 43kD  0.1
121079  AA398719  Hs.14169  ESTs; Weakly similar to CREB-binding protein [H.sapiens]  0.1
115820  AA427487  Hs.39819  ESTs; Weakly similar to RETICULOCALBIN 1 PRECURSOR [H.sapiens]  0.781
125106  T95766  Hs.189760  ESTs  0.1
131373  N68116  Hs.26146  Down syndrome critical region gene 3  0.1
120224  Z41239  Hs.106960  ESTs  0.1
133090  A/V448228  Hs.6468  ESTs  0.1
132300  AA133244  Hs.44234  ESTs  0.1
113129  T49384  Hs.8988  EST  0.1
110638  H73197  Hs.17241  ESTs  0.1
131364  R53255  Hs.26010  ESTs  0.1
105370  AA236476  Hs.22791  ESTs; Weakly similar to transmembrane protein with EGF-like and two follistatin-like domains 1 [H.sapiens]  0238
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TABLE 1 1 A shows the accession numbers for those primekeys lacking unigenelD's for 
Table 11. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.






Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers



Pkey CAT number 
100610 19864J 



100874 21517.2 



108559 41469.9 

100721 19818J 

100748 41861J 

100750 15759J 



100751 24700J






100760 1334.7 
100775 18179.3 



Accession

AW161357 AI87S0S2 AK28938 AW161097 AW161 187 BE314485 AA351716 F07096 AA179034 F08510 F00653 AJ936671 
AA476718 AW772454 AI807703 R44253 AA976687 AI98518S A1650254 K38942 RB4829 AA018724 AA001000 H85934 
AA019126 H85609 AA017000 AA339355AW950556D51397AA213981 BE548002 A1055359 AA001560 AW9521 13 
AA317769 AI857477 AI857475 AW249771 AW162661 H38943 AA018628 R85885 A1984613 AI934765 AI796172 AW157488 
A1929191 R85523 051221 D53851 H85610 A1749674 F21582 AA323145 AA019127 AA687444 T06745 A1699293 H29532 
AA214029 AA223658 NM_01 6834X14474 R19697 H09695 R17455 R13812 R19056 A1681231 AI59020O R37671 AA861828 
AI990Q23 A19356G9 AW005821 AA324581 H17335 R37659 R42802 R46242 R60938 R59731 H28993 AA479907 R44570 
AI890696 AA308884 AA507078 R41274 AI365507 T16348 A1560453 F03259 F04722 T16312 AA016081 AW073061 
BE314824 W28930 R44098 R51045 

AW403342 AW248986 BE561709 AA357312 BE31 1834 BE389496 BE294887 AW732696 BB047B88 AI702383 BB019155 
AI702367 BE408968 BE280458 BE313759 BE513492 BE535404 BE280258 AC005263 NM.007165 121990 AW732711 
AI564920 AW249094 BE265365 AW607186 AW607346 BE005217 H27211 U46230 BE260066 BE207043 BE546782 
AW248659 
AA085228AA085161 

L4O904 NM.005037 X90563 AB005526 H2159B AA088517 
X06Q96 X05826 

BE157260 BE157265 R481 18 H43827 Z17877 AW379070 AW291778 M20605 J03253 M14206 V00568 AIB60465 AW296022 
M13930 AL047400 J00120 BE018476 AW675223 726980 F06694 R2Z709 R24720 H22753 AI903100 AI903094 AW937823 
X00384 D10493 K01904 KD1906 K00535 100058 AA410662 AW384760 AA304930 AI680985 X00198 H58025 AW998901 
AV653447 N31654 AW610357 AW610369 AWB82480 BE223010 AW384172 AW384219 AW384171 AW384218AA298522 
BE140421 AW945162 AW751711 AA514409 AW747912 A214214 W87741 AA972406 AA554513 BE302087 AB49030 
AA477850 AV653129 A12813S0 AI274110 W87881 AA841388 X66258 AI051600 AA877139 AA527483 AA857219 AI250782 
AA625531 AA807892 A1278811 A1224033 R24033 AA593396 AW1 20709 R45453 N22772 AA235530 T29737 A101 6409 
AI688907 AA568370 AA722760 AI539329 AA550843 AW674698 AI538452 AI538453 AI337957 AA477744 AA464600 
A1140319 AW949294 AI339781 AI828738 AA923834 AA344094 AE78350 AA975567 AA908416 AAB57170 AW023520 
R43413 R48004 F02958 AI989439 R1 1207 AA737S07 D1 0493 AW950652 AI093842 AM74024 AA703369 R1 1 284 M13930 
M1 3930 M13930 M 13930 Ml 3930 J00120 M13930 M13930 X00364 J00120 R19507 AA639812 
N32759 N29730 N30831 N32604 N31955 AI206390 H87574 R23494 AI186215 N30036 AT741512 J00117 NM.000737 
AI453628 AA330974 AI188729 AI188604 AI188964 N30276 AJ188947 AI188830 AI188303 AI200457 AI219166AI192459 
A1183280 AI189275 AI188639 All 86353 AI189616 AI184224 AI1 30720 All 88454 AI188391 A! 1 48857 Al 192447 A1209155 
A1190013A1206355AI188721 AI189429 AI189364 A1188330 AJ431595 Al 189595 AI188781 AI148647AI200Q22 AI221552 
A1220923 AI188728 AA233034 AI189807 AI189641 AI219044 AI148774 AI200658 W71989 AI207360 AI1 88824 A1200559 
A1200270 AA644163 AI199943 AI151301 A1189555 A1262724 AI148590 AI148695 A1126906 AI149163 K03183 K03189 
A1189842 A1221014 N30608 AI1 86465 AI220865 AI188498 A1138226 AI189968 AI221019 A1138197 A1149428 A1148904 
AI186218 AI188348 AI160579 A11984B0 AI149039 AI160336 AI219055 A11B4784 A1221580 A1161082 A1160814 AI123896 
A1417614AI126101 AI188872 AI149571 AI168533A1149072AI149467 AI1 31286 N30684AI1 60705 AI160692A1149559 
A1273580 AI189442 AI1 38448 All 49591 N27302 AA4O0910 AI138431 AI138435 AI128407 N30216 A1128296 AI219589 
AI188492 A1149447 AJ 168482 H95374 AI219009 N31616 AE76216 N32233 A1291937 N30741 A1188889 N27111 R23214 
AI221605 All 84348 AI200375 H94451 N26397 A1871881 AA232905 N30833 AI220780 K94446 N30822 H87464 R68815 
N302S0 AI128424 Hi 2587 T47334 H87631 H87156 AK19133 A1868741 AA330859 H86993 AA330413 H93658 N30817 
T90191 H93668 AE00Q54 H95207 T4731 6 H95381 T49170R00880T49171 N27381 H94107 R53352 TB5053 AW451899 
H95142 N30313 H94015 H86987 T28278 N29701 C18834 AA331267 AA330939 AI554493 N27073 N29831 R681 13 N30758 
R26086 N32108 H95135 AA330414 AA33Q978 AE1&422 A!189453AI1 99951 X00264 NM_00C894 AA371909 AA063496 
T29543 AA371971 AA372026 AA371978 AA371346 AI051683 A1166418 AI220659 AI189068 AE19266 A! 186552 A!1 88715 
AI149156 

AW794626M27126M27014 

J05581 M61170 T27692 M34088 M34089 AW860335 AW578047 AW610437 AW610386 AW610422 AW610473 AW579078 
AW604897 AW860163 AW579067 AW862410 AI816584 AW177757 AW602769 AI909790 AW860331 AI909787 AJ9Q981 1 
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10 






100800 24735J 



100818 19804.3 

100881 458J27 

100885 12707.3 

100838 8542.1 



102459 3556.1 
126128 1630017.1 
102620 16821.37 
102673 24986.6 
102675 5145.4 
102753 2226.1 
102789 34624.4 
127034 51148L2 



103522 21640.1



127071 188097.1 

126456 291965.1

119388 1762256.1 

126856 20669.1 



103996 224545.1



113213 23798.1 



134947 844579.1 
129311 16078.1 



AI909813 AW845083 AI9Q5920 AW387919 BE140766 AI909279 AW369405 AA429321 AA429320 AA367451 AA847972 
AW001137 A1567905T84561 A1631295 AA1 51351 H02932 AB84519 AA367457 AW369421 A1678846 AW391803 AI610869 
AW192838 A19222B9 AI952140 A1910233 A1479474 AW001395 AA488073 AI985760AW1 30017 A1858369 AA627845 
AW081 805 AA156865 AI624443 AA344985 AA569793 R72486 A15B9329 AI903204 AI269893 AA541284 A1279932 AA149270 
AI697120 AA729146 AI589353 AA480067 AI923310 AA530908 AI275395 AA425062 AA580280 AA889527 AA158866 
AW131341 AA573028 AA877326 T29335 AW951288 H04235 AA099243 AA994659 A1659618 AA887919 AI299297 
AW001 1 16 AW263844 AI270578 AA970828 AW572126 AA775299 AW369449 AW389398 AW369452 AI933677 AI870710 
AJ092911 AI582464AM97674AA937Q26AA885865L38597AA908325AW369432AWQ26^ 
AA932409 A1 187328 AI672970 A1888098 AW440471 A W1 38860 AI866858 AI802528 A1926172 AW243914 AI933690 
AA996114 AA536189 AW009937 A191B060 A1270379 AJ973169 AW175638 AW369413 

NM.006227 L26232 R50649 AU077024 AL008726 AA41 1079 R35151 BE2781 53 BE278139 A1459777 R88036 Z43210 
F07326 AF052157 R 17844 BE615476 T82160 R71985 H21963 AA299158 AW368246 R48123 R50628 R70441 K27245 
H72015 R72345 R39392 A1909738 BE612778 BE613234 D521 16 D52136 D52132 D52067 D51922 D51995 051905 N34249 
N25459 AA464438 AA297350 AA297466 R81738 H02737 AW582505 R27523 AI834241 AW130867 W72668 W76426 
AA358363 R5Q262 AW473880 H52335 H43953 K21964 T39505 AI887517 AW156925 AW839850 H02628 AW007705 
AI561008 F22392 R71279 AA995433 R50725 W24462 R71931 AA464437 AW591731 R25667 R52695 R5081 0 AI560805 
AI089266 H68388 H41353 K28590 AW001860 A114162S AA250773 AI284778 AW51 1412 AWD83975 AA130377 AW026047 
R50551 R81494 AI357666 A1078272 F32666 F36981 AW304865 H43906 AAS31068 R48010 A1540217 AI017339 AI291812 
AI741954 AA458490 AI088378 AA298764 H61 168 AA358362 AA298725 AA298515 AA464148 AA443538 R43046 AA084314 
T40641 T47608 T48940 A1082477 AW470145 N92284 AI758958 AA298512 AA284586 A1597777 AA480277 AI932559 
A1869081 AA476615 AA503651 AI656024 AW168522 AI682051 AI689106 A1274592 AI520917 BE258916 BE615881 
BE28Q282 R53386 BE278255 BE278398 T47607 AA477662 H68385 

100817 19648.1 L34355 U6810 NM.000023 U08895 AA424260 A1097272 AA4241 62 N79764 F1 9290 F25278 A1479385 
AA460662 AA432059 AW016935 F25770 F32549 F36677 F33016 F35992 F36010 AW1 72497 AA835076 F28727 AA211643 
AA453282 

U79251 AA843851 R38201 R66461 R44908 AA683289 H17477 R37364 R52632 AW298336 AA351391 NNL0Q2545 L34774 
AA296886 AW967001 T28889 R13451 T77331 AL1 19196 AL1 18830 H08459 AW892812 AW905838 H17585 R52878 
BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 BE562381 BE514256 BE514403 BB14250 BE397832 
BE269598 BB59865 BE396881 BE560031 BS14199 BE560Q37 BB60454 
X07881 KM.006249 X07637 AA376715 AA376677 X07715 X07704 S80916 

BE387614 R51501 AA199714 AW674779 F08178 BE269071 AA376313 H08264 AA380420 H18785 AL042151 BE277758 

BE267438 NKL005850 L35013 BE540833 BE390902 BE391494 BE277459 BE385592 BE390612 BE384263 BE3S7779 

BE388647 BE537373 BE547158 AW409585 AW374033 AW602185 AA355725 AW577548 AW935015 AW935160 W40232 

AW938647 AW374332 AA434040 BE293488 AL138361 BB60260 AI745075 AA317980 AW948382 AI834311 AJ653582 

A1831042 AE361678 AA618606 AA729052 AI424969 AA199715 AW769374 AI828422 AW044307 A1B62816 AJ203583 

AW084461 AW514655 AA831883 AA290672 AA831286 AA578510 AW089965 AW150746 AA292743 H22232 AI469275 

AW439312 AA282744 AW471443 A1473989 AA593336 AA464070 AI678937 AW069451 AA970763 AA610480 AA593328 

AA464009 AA768985 AI298928 AA436600 AA464718 AA699361 D61482 D55935 AI369591 AA470695 AI809135 AA640627 

A1568446 R51502 W45467 AI655316 AA463934 AW168609 AW518663 BE045525 Z41251 AI668091 AA9081 60 AKJ26697 

AI886259 AJ612932 AA215437 AI956014 BE541087 BE255652 BE265878 BE394102 W27502 

U48936 L36592 X871 60 NM.0Q1039 AL036606 AL036420 U35630 AW298574 

WB0551M8537D 

AA976427U66052 

AI457548U72509 

U72512T98357 R31335 F18Q90 

L32961 NM.000663 U80228 S75578 AA425061 AA429317 AI815143 AA910669 A1286022 AI286019 
U88896 U88898 AA916056 T03285 AI341594 AI359534 A1634031 U88897 

BE397750 AA232171 BE562900 BE384894 BE242228 BE20681 9 BE261742 AA296468 AW959763 BE276164 BE264109 
BE382626 BE256735 AA301453 N55872 H01676 AA292746 AA427485 AA496400 AA352389 

Y10518 Y10514 Z83935 Y10508 AK000055 Y10519 AI142012 A1681 175 BE222219 AA890586 BE504347 BE328064 N63044 

N51228 AI151248 AI521996 AI924777 AW375954 A1860275 W00549 AI742673 AW612288 AI763062 AA632510 AI087347 

AJ088070 AI214349 AA890297 AI494156 AI698598 AA631658 AA504593 AA860733 AI266761 AW663214 AW771231 

AA839610 AI769806 AI769746 AW014326 AI28861 1 

AA250806AA459220 

AA429212 W00881 

T88798R92430 

A1084125 AI083773 A1479687 A1939609 AI968662 AF129507 NM_013282 AW971840 AW298508 AA744240 AA811217 
AA827671 AA81 1 055 AA806567 AA488977 AA908902 AI637637 AA927056 AI8701 39 AW340492 AA488755 AA129794 
AA306523 AA354253 8E256277 AC053467 AW962084 

AA321355 AW964592 R23284 H73883 R23382 N47914 C01377 H04668 AW606248 R34447 AA847136 AI684489 AI5231 12 
AW044269 AI379138 N29386 AA761543 N79248 AA960845 AA768316 AI147926 AI718599 AI880620 R67467 AI216016 
AI738663H04648 

NM.001395 Y08302 AI434619 AI470328 AI261807 AW024965 AI806537 AI830549 AI640337 A1218065 AW271700 
AWD28488 A1133339 AI859205 R51 175 U87167 BE379324 BE392008 AA340819 AA3431 1 0 T57275 D59164 AW299312 
A1434422 AI936390 AW024975 R40262 

AW269126 RQ9430 756590 AI367247 A12531 32 BE464248 T58658 AW207785 T58607 
R51 194 AI732276 R53587 AI820697 

AK000526 BE550084 W30689 AW271859 AA41 1456 AB41551 AA242990 AA243027 H87D46 D20360 AI184053 AA146956 
AI721023 AI71B944 AA146955 F18215 AA903890 AI700355 A1075430 AA41 1584 AA878210 AM76760 AW945637 AA530595 
AA431522 AA301889 AI909058 012149 N41960 BE222214 AA6G9922 AA828176 AA393359 AA398693 AW024956 
BE487805 AW298623 AW264085 AI024454 AKJ24719 A1431927 T55087 AI51 1014 T54920 AA131253 AI436344 

114427 9724_2 AA017176 A1359979 AA047836 AA017063 AA01 6303 AA001545 

114569 110077 J AA063315 AA063316 

100106 15621.-5 AF015910 

100515 342_1 AA305746 D901 87 T63943 AW951 154 T291 82 AT734941 D13264 AI299239 Z18812 AW299859 W24476 AA933064 

AA489759 

100531 46038.1 AWB88554 AWB07282 AA319986 M28590 

100545 22955J1 M55405 AW752552 

100574 17320.2 AA326895 M1 0036 NM 0003S5 N84665 H69414 N84657 AA380453 AA329743 AA357387 AA1 88770 AA37B532 AA353653 

AA158953 AA083176 BE537313 AA181433 D53373 R57376 AA206698 R 14807 H18899 H1 1191 H93892 R25593 T61 134 
K93285 AA083081 AA831789 H13137 AA4970U AA079330 AA182861 H13138W47161 R62913 AA587089 AA211112 
AA429237 AL035923 AA100070 AW392898 A(566433 AA866006 AA2140Q2 AW392865 M79454 AA197181 AI680371 
AA176501 AA737967 AI089225 F34874 AW571437 AI620620 AA573489 AA423816 AA164917 AA458455 T47072 AI569087 
A1261656 AA730919 AI633441 AW185182 AI351622 AW243465 AI872649 AI359227 AAS37941 AJ693770 T47073 AYV779948 
AW510580 A1635626 AW527601 AA664326 AA953578 AI341418 BE222853 AI241963 AI094863 AA928380 AA493373 
AW043762 AI377783 AW958987 BE619760 AA385240 BE277975 BE280095 AW631443 AA581048 BE618715 BE299610 
C14874 BE559858 BE378455 BE618290 BE544585 AI525575 BE548897 BE2671 10 AA804738 BH269821 AA918133 
BE277647 AA599947 BE280735 BE390239 N74150 T12504 AI208197 AW955527 AA113897 N40081 H73835 H70393 
A1434041 W22950 All 92661 BE264461 W26486 AA626424 AA1 96694 T69209 AA857976 AK40287 AA410599 AA864287 
AW950564 AA013320 T49283 A1541438 AW804703 AA335534 AA335659 BB62269 BE618802 BE277850 BE546413 
BE280994 AA204813 BE561694 BE543524 BE253647 AW001452 W191 16 BE542508 AA205894 BE254875 BE270033 
AI525906 BE251792 AA975700 BE272138 AW607671 N87686 M10036 BE515060 BE298607 AI745178 U47824 H03193 

100627 tigr_HT2798 225424 

100756 tigr_HT3768 M88357 

100768 tigr_HT3846 L29141 M69180M811Q5 

100813 tigr_HT4265 L33999 

100836 tigr_HT4383 U04688 

100855 tigr_HT4504 U09806 

102104 entrez_U12139 U12139 

125091 genbank_T91518 T91518 

100929 tigr_HT688 X65561 

125147 „entTKLW38150 W38150 

102354 entrez_U38268 U38268 

102491 entrez_U51010 U51010 

102636 entrez_U67092 U67092 

118769 genbanK_N74496N74498 

101046 entrez_K01160 K01160 

101057 entrez_K03430 K03430 

108334 genbank_AA070473 AA070473 

108417 483241.1 AA070853 AA075749 AA075716 

108441 genbank_AA079079 AA079079 

108786 genbank_AA128999 AA128999 

101655 entrez_M60299 M60299 

101697 entrez_M64358 M64358 

117437 genbank_N27645 N27645 

101798 entrez_M85220 M85220 

101909 entrez_S69265 S69265 

103508 entrez_Y10141 Y10141 

103575 entrez_Z26256 Z26256 

119332 genbank_T54095 T54095 

112161 genbank_R48295 R48295 

119564 NOT_TOUND_entrez^W38206 W38208 

114378 NOT_FOUND_entrez_GMCSF GMCSF 

100478 tigr_HT1067 M22406 

100547 tigr_HT2219 M57417 

100564 tigr_HT2324 Z11585 
TABLE 12: shows genes, including expression sequence tags, that are down-regulated in 
prostate tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos 
Hu01 GeneChip array. Shown are the ratios of "average" normal prostate to "average" 
prostate cancer tissues. 



ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Background subtracted normal prostate : prostate tumor tissue 
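
By way of illustration only, the R1 ratio defined above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `r1_ratio`, the single scalar `background` value, and the `floor` used to keep the ratio finite are hypothetical and are not taken from this document, which does not specify how background subtraction or averaging was performed.

```python
def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Ratio of mean background-subtracted normal signal to mean tumor signal.

    background and floor are illustrative assumptions, not values from the source.
    """
    if not normal or not tumor:
        raise ValueError("both sample lists must be non-empty")

    def mean_adjusted(values):
        # Subtract background and clamp at the floor so the ratio stays finite.
        adjusted = [max(v - background, floor) for v in values]
        return sum(adjusted) / len(adjusted)

    return mean_adjusted(normal) / mean_adjusted(tumor)


# Hypothetical intensities for one probe set (not data from Table 12).
example = r1_ratio([110.0, 90.0], [40.0, 20.0], background=10.0)
```

Under this sketch, a value above 1 corresponds to lower signal in tumor tissue than in normal tissue, which is the direction of change reported in the table.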

Pkey ExAccn UnigeneID Unigene Title R1 

100522 HG1763-HT1780 Prolactin-Induced Protein 17.4 

130803 M81650 Hs.1958 semenogelin I 16.765 

118068 N53943 Hs.13743 ESTs 13.225 

114251 Z39898 Hs.21948 ESTs 12.7 

112134 R46025 Hs.7413 ESTs 8.735 

101436 M20642 Hs.158235 Human alkali myosin light chain 3 mRNA; complete cds 8.175 

104028 AA361094 Hs.221128 ESTs 8.15 

108944 AA149204 Hs.175783 ESTs; Highly similar to growth arrest inducible gene product [H.sapiens] 7.535 

103838 AA174173 Hs.12622 ESTs 7.212 

120469 AA251741 Hs.25882 DKFZP586M1824 protein 7.175 

110279 H29231 Hs.27384 ESTs 6.701 

127472 AA761378 Hs.192013 ESTs 6.642 

133301 N35229 Hs.7037 pallid (mouse) homolog; pallidin 6.411 

102457 U48807 Hs.2359 dual specificity phosphatase 4 6.395 

114011 W90385 Hs.15082 ESTs 6.15 

101249 L33881 Hs.1804 protein kinase C; iota 6 

123265 AA491209 Hs.105265 ESTs; Weakly similar to reverse transcriptase [M.musculus] 6 

119322 T49655 Hs.241569 ESTs; Modly smlr to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 555 

101673 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 5525 

115586 AA399218 Hs.92423 ESTs 5.7 

120590 AA281780 Hs.111441 ESTs; Weakly similar to similar to Kruppel-like zinc finger protein [C.elegans] 5.7 

109748 F10192 Hs.248323 Tubulin; alpha; brain-specific 5.625 

134727 X80507 Hs.8939 yes-associated protein 65 kDa 5J5 

129171 AA234048 Hs.7753 calumenin 5.486 

120390 AA233122 Hs.111460 ESTs; Highly similar to multifunctional calcium/calmodulin-dependent protein kinase II delta2 isoform [H.sapiens] &A 

131699 R68657 Hs.190421 ESTs; Modly smlr to !! ALU SUBFAMILY SX WARNING ENTRY !! [H.sapiens] 5.279 

104490 N71503 Hs.43087 ESTs; Weakly similar to dysferlin [H.sapiens] 5.266 

102124 U14528 Hs.29981 solute carrier family 26 (sulfate transporter); member 2 5.151 

109280 AA196635 Hs.86081 ESTs 5.134 

109707 F09739 Hs.185701 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 21920 5.075 

108087 AA045709 Hs.40545 ESTs 5.075 

135006 M21665 Hs.929 myosin; heavy polypeptide 7; cardiac muscle; beta 5.055 

119182 R80664 Hs.77067 ESTs 5.033 

129806 R62444 Hs.173373 KIAA0931 protein 4.675 

101435 M20543 Hs.1288 actin; alpha 1; skeletal muscle 4.626 
125954 R93943 yt72c12.r1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:275735 5' 4.6 

113989 W87544 Hs.221184 ESTs 4.559 

104432 J03460 Hs.99949 prolactin-induced protein 4.451 

112326 R56068 Hs.4268 ESTs 4.45 

119063 R16833 Hs£3106 ESTs; Weakly similar to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 4.45 

130376 R40873 Hs.155174 KIAA0432 gene product 4.301 

122484 AA448286 Hs.98074 ESTs; Highly similar to atrophin-1 interacting protein 4 [H.sapiens] 4.2 

104142 AA447006 ESTs; Moderately similar to !! ALU SUBFAMILY SQ WARNING 4.175 

129413 N32787 Hs.11123 ESTs; Moderately similar to hypothetical protein 2 [H.sapiens] 4.1 

103678 Z84483 Human DNA sequence from PAC 46H23, BRCA2 gene region chromosome 13q12-13 4.05 

114266 Z40166 Hs.26409 ESTs 4.05 

115206 AA262491 Hs.186572 ESTs 4.048 

123723 AA609749 Hs.112759 ESTs; Highly similar to unknown protein [R.norvegicus] 4.041 

129130 H97993 Hs.172788 ESTs; Weakly similar to KIAA0512 protein [H.sapiens] 4.028 
3.85 

3333 

3318 

3.792 

3.779 

3.768 

&75 

3.708 

3.707 

3.7 

3.7 

3574 

3.653 

3.625 

352 

3.614 

3513 

3j6 



20217 Z41078 Hs.66035 ESTs 
08538 AA084524 zn19d8.s1 Stratagene nsuroepitheGum NT2RAM1 937234 Homo sapiens cDNA 4.023 

34460 AA4O0030 Ha336Q ESTs; Weakly similar to 0 ALU CLASS 8 WARNING ENTRY I! [H-sapIens] 3*25 

20418 AA236010 H&26613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from done DKFZp586F1323) 3*1 

32783 N74897 Hs.5683 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15 

25052 T80174 H&222779 ESTs; Moderately similar to similar to NEDD-4 [H^apiens] 

08600 AA0995B5 Hs.41175 ESTs 

03099 X61100 Hs.8248 NADH dehydrogenase (ubiquinone) Fe-S protein 1 (75kD) (NADH-coenzyme 

34948 H06773 Hs.93850 protein kinase; AMP-activated; gamma 2 non-catalytic subunit 

120511 AA258144 H&221578 ESTs 

881 R37460 Hs.25231 ESTs 

3966 W8S600 Hs3842 ESTs 

649 AA481254 H&30120 ESTs 

29775 R94659 Hs.12420 ESTs 

10191 H20568 Hs27182 phospholipase A2-activating protein 

12878 R87160 Hs33665 ESTs 

27115 AA375791 Hs.131894 ESTs 

32892 W92797 Hs59378 DKFZP434G162 protein 

15023 AA252079 Hs.63931 dachshund (Drosophila) homolog 

14932 AA242751 Hs.1 62 18 KIAA0903 protein 

06865 AA487228 Hs.19479 ESTs 

34480 AA024664 Hs33918 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex; 5 (13kD; B13) 

24780 R42493 H&220839 ESTs 

30631 AA025399 Hs.1 69737 ESTs 

34154 AA211320 Hs.79404 neuron-specific protein 

104160 AA455706 Hs39722 ESTs; Weakly similar to 78 KD GLUCOSE REGULATED PROTEIN 
PRECURSOR 

05524 AA258158 H&22153 ESTs; Weakly similar to K1AA0352 [H^ap'ens] 

10168 H19673 Hs.176586 ESTs 

09480 AA233299 Hs.72158 ESTs 

09585 F02367 HSJ27252 ESTs 

15134 AA257107 Hs.194331 ESTs 

16083 AA455653 Hs.44581 ESTs; Weakly similar to HEAT SHOCK 70 KD PROTEIN 6 [H.sapiens] 

20524 AA261852 Hs.192905 ESTs 

16932 H74330 Hs.150000 ESTs 

30746 AA256976 Hs.18800 ESTs; Weakly similar to KIAA0579 protein [H.sapiens] 

07513 X05451 Hs.158295 Human alkali myosin light chain 3 mRNA; complete cds 

18641 N70298 Hs.49829 ESTs 

28584 AIQ28384 H&127331 ESTs 

05134 AA159953 Hs.22895 ESTs; Weakly similar to arylsulfatase B precursor [H.sapiens] 

23502 AA600116 Hs.1 12526 ESTs 
H&47135 ESTs 

05691 AA287097 Ha.75356 transcription factor 4 

31505 H85897 H&27755 ESTs 

20775 AA342104 Hs36777 EST 

05579 AA278824 Hs.19218 ESTs 

28190 AA946876 Hs.148376 ESTs 
100819 HG4020-HT4290 Transglutaminase 

30217 D29956 Hs.152818 ubiquitin specific protease 8 
Hs.106220 WAA0336 gene product 
Hs.89232 chromobox homolog 5 (Drosophila HP1 alpha) 

Hs.151231 ESTs; Highly similar to FYVE finger-containing phosphoinositide kinase [M.musculus] 356 
Hs.185797 ESTs " 3212 



3-559 
a542 
3525 



35 

35 

3.459 

3.45 

3.425 

3.42 

3417 

3407 

3399 



3318 
3317 
3315 
3309 

as 

3295 
3292 
3288 
3273 



130068 AA608903 
34719 L07515 
10277 H29209 
27354 AM1B880 
29173 R60523 
27484 AA970504 
124923 R94500 
I22465 AA448164 
22027 AA431302 
103329 X85134 
29937 M95767 
34197 AA057341 
07764 AA018219 
121775 AA421773 
14768 AA149007 
32381 N48818 
23105 AA485973 
21176 AA400080 
25053 T80620 
05909 AA401739 



Hs.109087 ESTs 
Hs.146103 ESTs 
Hs.1 08046 ESTs 

Hs.99153 ESTs; Highly similar to CGI-73 protein [H.sapiens] 

Hs.98721 EST; Weakly similar to N-copine [H.sapiens] 

Hs.72984 retinoblastoma-binding protein 5 

Hs.135578 chitobiase; di-N-acetyl- 

Hs.87889 helicase-moi 

HS226923 ESTs 

Hs.161008 ESTs 

Hs.182339 Ets homologous factor 

Hs.46884 ESTs 

Hs.143947 ESTs 

H&97774 ESTs 

Hs.186473 ESTs 

HS5111 ESTs 
110767 W72562 Hs.58119 ESTs 3.057 

115776 AA424038 Hs.58187 ESTs 3.056 

111713 R22988 Hs.220950 ESTs 3.05 

115301 AA280047 Hs.43948 ESTs 3.05 

118448 N66412 Hs.49189 ESTs 3 

106586 AA456598 Hs.256269 ESTs 2.995 

110415 H48239 Hs.29739 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-3A [H.sapiens] 2.979 

105173 AA182030 Hs.8364 ESTs 2.978 

101102 L07594 Hs.79059 transforming growth factor; beta receptor III (betaglycan; 300kD) 2.976 

110543 H58383 Hs258544 ESTs 2.976 

125593 R24464 Hs.202949 KIAA1102 protein 2.964 

100824 HG4058-HT4328 Oncogene Aml1-Evi-1, Fusion Activated 2.957 

106822 AA481068 Hs*1835 ESTs 2.95 

131983 D11930 H&3592 ESTs 2*5 

IS 111221 K68869 Hs.15119 ESTs 2.936 

113820 T93795 Hs.17252 EST 2*17 

105220 AA210695 Hs.17212 ESTs 2*17 

123234 AA490227 Hs.105252 ESTs 2*04 

125250 W87465 Hs£22926 ESTs; Weakly similar to D20922 [C.ete gans] 2* 

20 116196 AA465160 Hs.63386 ESTs 2* 

122100 AA432243 Hs.41086 ESTs; Weakly similar to OXYSTEROL-BINDING PROTEIN [H.sapiens] 2*96 

111712 R22905 Hs.113716 ESTs 2*95 

126589 W78107 Hs.187698 ESTs; Weakly similar to YerWOwp [S.cerevisiae] 2.895 

111132 N64378 Hs.13149 ESTs; Highly similar to unknown function [H.sapiens] 2*94 

115307 AA280300 Hs.191346 ESTs 2.886 

108989 AA152263 Hs.18827 KIAA0849 protein 2*83 

129486 H03686 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2*79 

119805 W73788 Hs.43213 ESTs 2*75 

125721 R59881 Hs.7503 ESTs 2*71 

30 103704 AA028171 Hs.1 53888 ESTs 2*68 

128420 AI088155 Hs.14146 ESTs; Weakly similar to unknown [H.sapiens] 2*66 

120571 AA280738 Hs.128679 ESTs 2*63 

123059 AA482019 Hs.238202 EST 2.86 

129462 D84239 Hs.111732 IgG Fc binding protein 2.856 

125166 W45491 Hs.172609 nucleobindin 1 2*54 

125992 W01628 za36e07.r1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 2*52 

109431 AA227972 Hs.43635 ESTs 2*5 

105077 AA142919 H&5558 ESTs 2.847 

131388 R34531 H&92200 KIAA0480 gene product 2,846 

40 121080 AA398720 Hs.1 77953 ESTs 2*38 

112575 R73816 Hs.17385 ESTs 2*36 

130244 R26206 Hs.153293 KIAA0701 protein 2*25 

134698 AA427783 Hs.77910 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 (soluble) 2*16 

116355 AA504356 Hs*8650 ESTs 2*13 

45 115316 AA280627 H&57846 ESTs 2*06 

129677 U46736 Hs.198891 serine/threonine-protein kinase PRP4 homolog 2* 

130971 H20332 Hs.28707 signal sequence receptor; gamma (translocon-associated protein gamma) 2.799 

115054 AA252863 Hs*7729 ESTs 2.795 

130285 AA063548 Hs£Q2968 ESTs 2.792 

124308 H93575 Hs.227146 Homo sapiens mRNA; cDNA DKFZp564J142 (from clone DKFZp564J142) 2.783 

125502 AA732329 Hs.191959 ESTs 2.778 

114800 AA159825 Hs.131887 ESTs; Weakly similar to ORF YNL227c [S.cerevisiae] 2.768 

128625 AA242816 Hs.102652 ESTs; Weakly similar to KIAA0437 [H.sapiens] 2.766 

130159 H51098 Hs.151310 PDZ domain protein (Drosophila inaD-like) 2.75 

55 107127 AA620504 H&22119 ESTs 2.742 

113547 T90746 Hs.15233 ESTs 2.734 

104639 AA004622 Hs.1 82 14 ESTs 2.727 

127609 AA622559 Hs.150318 ESTs 2.726 

106922 AA490954 Hs.1 0056 ESTs 2.725 

60 124825 R52088 yg85c3.s1 Soares infant brain 1NIB Homo sapiens cONA clone 2.725 

124333 H98683 Hs.154054 ESTs 2.708 

117634 N36421 Hs.107854 ESTs; Weakly similar to SODIUM- AND CHLORIDE-DEPENDENT GLYCINE TRANSP 2.706 

101609 M54927 Hs.1787 proteolipid protein 1 (Pelizaeus-Merzbacher disease; spastic paraplegia 2; uncomplicated) 2.704 

117142 K96908 Hs»2251 ESTs 2.7 

112602 R79147 H&203365 ESTs 2*95 

106828 AA481505 Hs.1 3797 ESTs 2*8 

124377 N25996 Hs. 179333 ESTs 2*75 
101026 J04970 carboxypeptidase M 2X75 

124560 N66393 * Hs.102754 ESTfi • 2X75 

124066 H02494 Hs.101615 ESTs 2.671 

130281 R12777 Hs.15395 ESTs; Weakly similar to ARGINYL-TRNA SYNTHETASE [H.sapiens] 2.60 

110949 N496Q2 Hs. 13308 ESTs 2X5 

111031 N54839 Hs.221085 ESTs; Highly similar to mediator [H.sapiens] 2X33 

121770 AA421714 Hs.11469 KIAA0898 protein 2X3 

134132 U32519 Hs.220689 Ras-GTPase-activating protein SH3-domain-binding protein 2X26 

112424 R62452 Hs.191265 ESTs 2X25 

122544 AM51679 Hs.194410 ESTs 2.625 

134425 X90568 Hs.172004 titin 2.624 

111114 N63391 HSX238 ESTs 2.619 

116119 AA459242 Hs.44445 ESTs; Weakly similar to Kelch motif containing protein [H.sapiens] 2.615 

112079 R44164 H&23014 ESTs 2X 

123033 AA481271 Hs.193945 ESTs 2X91 

124196 H52617 Hs.144167 ESTs 2X88 

125873 H14437 yf25a04.r1 Soares breast 3NbHBst Homo sapiens cDNA clone 2X8 

117684 N40184 Hs.45050 ESTs 2X75 

134938 D30037 Hs.168326 phosphotidylinositol transfer protein; beta 2X75 

131822 AA215647 H&200332 ESTs 2X68 

135185 U71203 HsX6Q38 Ric (Drosophila)-like; expressed in many tissues 2X64 

117690 N40467 HsX3834 ESTs 2X57 

118807 N78582 HsX0732 protein kinase; AMP-activated; beta 2 non-catalytic subunit 2X52 

121369 AA40S657 Hs.128791 Human DNA sequence from clone B67N21 on chromosome 20p12X-13. Contains 2X5 

114860 AA235112 Hs.106227 ESTs; Moderately similar to similar to murine RNA-binding protein [H.sapiens] 2X49 

121857 AA426017 Hs.62694 ESTs; Highly similar to DNA-REPAIR PROTEIN COMPLEMENTING 2X48 

110190 H20560 Hs.244624 ESTs 2X48 

132573 AA045333 HsX1743 ESTs; Weakly similar to !! ALU SUBFAMILY SB2 WARNING ENTRY !! [H.sapiens] 2X42 

109708 F09729 Hs.12780 ESTs 2X37 

135109 AA410391 HsX4592 Klotho 2X25 

132810 R37027 HsX737 KIAA0475 gene product 2X25 

124879 R73588 Hs.101533 ESTs 2X25 

103840 AA174190 HsX0932 ESTs 2X25 

119068 R22196 HsX4492 ESTs 2X19 

114833 AA234362 HsX7310 ESTs; Moderately similar to CGI-66 protein [H.sapiens] 2X07 

112998 T23555 Hs.103288 ESTs 2X 

123312 AA488258 Hs.99601 ESTs 2.499 

121873 AA426270 Hs.145696 splicing factor (CC1X) 2.491 

123321 AA496884 Hs.23972 ESTs 2.491 

107760 AA018042 HSX5078 EST 2.483 

102580 U60808 Hs.152981 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 2.481 

103053 X56741 HsX947 mel transforming oncogene (derived from cell line NK14)- RAB8 homolog 2.475 

124758 R38100 Hs.106294 ESTs 2.475 

112938 T15665 HsX185 ESTs; Weakly similar to BcDNA.GH12174 [D.melanogaster] 2.475 

125178 W58202 Hs.125731 ESTs 2.475 

112423 R62447 Hs£2123 ESTs 2471 

123515 AA600323 Hs.1 12535 EST 2462 

102842 U95020 Hs.21903 calcium channel; voltage-dependent; beta 4 subunit 2.457 

102400 U42390 Hs.171957 triple functional domain (PTPRF interacting) 2455 

113187 T56058 HsX992 ESTs 2452 

131687 L11066 HsXQ69 heat shock 70kD protein 9B (mortalin-2) 2.448 

115314 AA280583 Hs£56501 ESTs 2.437 

128211 AI206427 Hs.166707 ESTs; Highly similar to Ran-binding protein 2 [H.sapiens] 2.43 

134281 L11005 HsX1047 aldehyde oxidase 1 2425 

115985 AA447709 Hs.132094 ESTs; Moderately similar to putative transcription factor CA150 [H .sapiens] 2.425 

111348 N90041 HSJ9585 ESTs 2.418 

129430 AA258842 Hs.197877 Homo sapiens clone 23777 putative transmembrane GTPase mRNA; partial cds 2.418 

133863 C13990 Hs.76930 synuclein; alpha (non A4 component of amyloid precursor) 2.417 

111164 N66857 Hs.14808 ESTs; Weakly similar to !! ALU CLASS C WARNING ENTRY !! [H.sapiens] 2.416 

132143 AA257058 Hs.7972 KIAA0871 protein 2.412 

130330 M55047 Hs.154679 synaptotagmin 1 2408 

114219 Z39451 Hs£7389 ESTs 2406 

117101 H94043 Hs*4341 DKFZP58611419 protein 2403 

125433 AA034325 HsX4320 ESTs 24 

111099 N62506 H&21958 ESTs 24 

120323 AA195405 Hs.1 10347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 2X97 

118624 N69998 Hs21801 ESTs 2X94 

123570 AA608955 H&109653 ESTs 2X89 

123562 AA608893 Hs.190065 ESTs 2X88 
131546 AA262821 Hs.28578 muscleblind (Drosophila)-like 2.385 

103143 X66141 Hs.75535 myosin; light polypeptide 2; regulatory; cardiac; slow 2.384 

123645 AA609310 Hs.188691 ESTs 2383 

130123 AA001835 Hs.150390 zinc finger protein 262 2379 

131682 AA428368 H&30654 ESTs 2378 

115909 AA436666 HS39761 ESTs . 2375 

125168 W45574 Hs252497 ESTs 2372 

123973 C14805 Hs.182151 ESTs 2361 

135197 U76456 Homo sapiens tissue inhibitor of metalloproteinase 4 mRNA, complete cds 2.357 

118689 N71545 Hs.1 84544 ESTs 2357 

107734 AA016225 Hs.93386 ESTs 2354 

124590 N59220 Hs.41381 ESTs; Weakly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 2.35 

111163 N66850 Hs.17606 ESTs 2348 

112349 R58877 H&22665 ESTs; Moderately similar to (1J83L6.1 [H^apians] 2345 

129076 AA262179 Hs. 169343 ESTs 2345 

134238 R51509 Hs.184571 splicing factor; arginine/serine-rich 11 2.341 

116766 H13260 Hs.95097 ESTs 2336 

106331 AA436853 HS34795 ESTs 2333 

129003 AA443752 Hs.10784 ESTs 2332 

132368 AA599814 Hs.46637 ESTs; Weakly similar to cDNA EST yk289g55 comes from this gene [C.elegans] 2.332 

124697 R06273 Hs.186467 ESTs; Modly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.322 

120273 AA176688 Hs.221139 ESTs 2.313 

127110 AA304993 Hs.100861 ESTs; Weakly similar to p60 katanin [H.sapiens] 2.307 

105450 AA252621 H&93842 ESTs 2301 

119819 W74371 Hs38383 ESTs 2297 

102302 U33052 Hs39171 protein kinase C-like 2 2.288 

130598 N74353 Hs.16475 ESTs £282 

114161 Z38904 Hs22385 ESTs; Weakly similar to KIAA0970 protein [H.saptens] 2278 

130542 U64675 Human sperm membrane protein BS-63 mRNA, complete cds 2277 

104491 N71513 Hs39328 ESTs 2275 

116988 H82527 ys69e12.s1 Soares retina N2b4HR Homo sapiens cDNA clone 2.275 

126823 AA370120 Hs.7870 ESTs; Weakly similar to YIr350wp [S.cerevisiae] 2.273 

108800 AA129731 Hs£0424 ESTs 2.273 

101310 U1607 HsS34 glucosaminyl (N-acetyl) transferase 2; I-branching enzyme 2.269 

126842 W19498 HS21085 ESTs 2255 

127251 AA936428 Hs.128638 ESTs 2251 

124647 N91947 Hs. 125033 ESTs 2249 

127112 AI143906 Hs.125103 ESTs 2247 

101973 S82597 Hs30120 UDP^cetyl-a[pha-r>gaiact^ 2246 

120999 AA398302 Hs.127437 ESTs 2245 

130225 AA599583 Hs.15299 HMBA-inducible 2.243 

119980 WB8678 Hs249247 heterogeneous nuclear protein similar to rat helix destabilizing protein 2.243 

124222 K61053 Hs222844 ESTs 224 

129199 H90914 Hs.128629 ESTs 2236 

106802 AA479101 Hs.16570 ESTs; Weakly similar to !! ALU SUBFAMILY SQ WARNING ENTRY !! [H.sapiens] 2.231 

126160 N90960 Hs247277 ESTs; Weakly similar to transformation-related protein [H.sapiens] 2.229 

104627 AA001976 Hs.19603 ESTs 2228 

106474 AA450212 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from done DKFZp564C053) 2226 

113096 T40927 Hs3345 ESTs 2225 

135336 AA452822 Hs.99027 ESTs 2225 

135344 R82976 Hs.168491 ESTs; Moderately similar to TRF1-interacting ankyrin-related 2.225 

126156 AA508354 Hs.118448 ESTs; Moderately similar to AKT3 protein kinase [H.sapiens] 2.222 

128885 AA397841 Hs.180141 cofilin 2 (muscle) 2.218 

107900 AA026385 Hs.176600 ESTs; Moderately similar to !! ALU SUBFAMILY SB2 WARNING 2.217 

114481 AA033562 Hs.151572 ESTs 2212 

109292 AA199828 Hs.1 88662 ESTs 2212 

104257 AF006265 Hs.9222 estrogen receptor-binding fragment-associated gene 9 2209 

132932 T15482 Hs.6093 ESTs 2204 

127392 AA262728 Hs.14898 Homo sapiens clone 24590 mRNA sequence 2204 

104641 AA004652 Hs.1 8594 ESTs 22 

122529 AA449828 Hs.99229 ESTs 2.195 

124307 H93582 Hs.162395 proline synthetase co-transcribed (bacterial homolog) 2.193 

133601 S95938 Hs.75155 transferrin 2.193 

119904 WB5709 Hs.128927 ESTs; Weakly similar to !! ALU SUBFAMILY SP WARNING ENTRY !! [H.sapiens] 2.192 

100348 D64109 Hs.4994 transducer of ERBB2; 2 (TOB2) 2.185 

126871 AA351779 Hs200334 ESTs 2.18 

127793 A1298335 Hs30445 ESTs; Weakly similar to transcription regulator Staf-50 [H.sapiens] 2.178 

105149 AA169253 Hs3958 ESTs 2.177 
121367 AA405648 zw39gta1 Soares_total_fetus_Nb2HF8_9w H sapiens cDNA clone IMAGE:772478 2.177 
126202 AA652238 
115955 AA446121 
104164 AA458770 
106692 AA121270 
122878 AM55341 
134771 L13939 
104298 D31120 
104840 AA039595 
122180 AA435798 
131012 K01992 
134092 H17490 
118617 N69666 
107155 AA6212Q2 
130925 N71935 
135167 U63717 
105952 AA405263 
110308 H38148 
116368 AA521186 
132939 U76189 
117881 N50073 
121723 AA419622 
103500 Y09443 
121429 AA406293 
134632 AA398710 
129785 F10980 
111065 N58193 
114710 M129931 
132711 N73702 
133377 R05490 
124773 R40923 
117759 N47587 
127386 AW57411 
101167 L15309 
109597 F02582 
124390 N29325 
116225 AA478609 
131243 R16667 
130557 T90830 
134103 D14828 
108833 AA131866 
112286 R53765 
125624 AA165411 
124612 N72200 
116335 AA495830 
112248 R51361 
115789 AA424754 
107029 AA599219 
110294 H30270 
120532 AA262354 
116180 N59249 
132018 AA293194 



Ha25119 ESTs 2.175 

H&237225 ribosomal protein S5 pseudogene 1 2.175 

Hs.145053 ESTs 2.175 

H&239666 ESTs 2.175 

Hs.41749 protein kinase; cGMP-dependent; type II 2.161 

Hs.4815 nudix (nucleoside diphosphate linked moiety X)-type motif 3 2.157 

H&25324 ESTs 2.157 

Hs£9559 KIAA1096 protein 2.156 

Hs.73980 troponin T1; skeletal; slow 2.155 

Hs.8078 Homo sapiens mRNA; cDNA DKFZp586L081 (from clone DKFZp586L081) 2.155 

Hs.9552 binder of Arl Two 2.151 

Hs.117211 ESTs; Highly similar to CGI-62 protein [H.sapiens] 2.15 

HSU33921 Clontech adult lung cDNA library (HL1158a) Homo sapiens cDNA 2.15 

Hs.12432 Homo sapiens clone 24407 mRNA sequence 2.15 

Hs.159640 serum/glucocorticoid regulated kinase 2.15 

Hs.41327 ESTs 2.15 

Hs227676 ESTs; Moderately similar to !! ALU SUBFAMILY SQ 2.137 

Hs.75295 guanylate cyclase 1; soluble; alpha 3 2.137 

Hs.199726 ESTs 2.135 

H&44198 Homo sapiens 8 AC done RG054O04 from 7o31 2.134 

H&27023 K1AA0917 protein 2.132 

H&82960 ESTs A 2.128 

H&99640 ESTs 2.126 

HSX8576 adaptor-related protein complex 1; beta 1 subunit 2.125 

H&4036B adaptor-related protein complex 1; sigma 2 subunit 2.125 

H&4245B Homo sapierus mRNA; cDNADKFZr^ 2.125 

HsJ8835 ESTs; Moderately similar to putative ring zinc finger protein 2.125 

Hs.202949 KIAA1102 protein 2.125 

Hs.7905 ESTs; Highly similar to sorting nexin 9 [H.sapiens] 2.123 

Hs.183413 ESTs; Modly smlr to !! ALU SUBFAMILY J WARNING ENTRY !! [H.sapiens] 2.123 

Hs.7946 DKFZP586D1519 protein 2.12 

Hs.169378 multiple POZ domain protein 2.12 

H&95821 osteoclast stimulating factor 1 2.118 

Hs.181400 ESTs 2.109 

H&32775 ESTs 2.108 

HsX4217 ESTs 2.107 

Hs.61152 exostoses (multiple)-like 2 2.102 

Hs.84926 ESTs; Highly similar to B-IND1 protein [M.musculus] 2.1 

Hs.104800 ESTs; Weakly similar to Mouse 195 mRNA; complete cds [M.musculus] 2.096 

Hs.22580 alkylglycerone phosphate synthase 2.094 

Hs.183498 ESTs 2.093 

Hs.174139 chloride channel 3 2.091 

Hs.184780 ESTs 2.09 

Hs.18740 ESTs; Weakly similar to 1-evtdence 2.089 

Hs.79081 protein phosphatase 1; catalytic subunit; gamma isoform 2.083 

Hjl238927 ESTs 2.083 

Hs.7239 SEC24 (S. cerevisiae) related gene family; member B 2.079 

Hs.106604 ESTs 2.078 

Hs.97345 ESTs; Weakly similar to TROPOMODULIN [H.sapiens] 2X76 

Hs.106728 ESTs 2.076 

Hs.193677 zinc finger protein 141 (clone pHZ-44) 2X75 

Hs.14474 ESTs 2.074 

Hs.7535 ESTs; Highly similar to COBW-like placental protein [H.sapiens] 2.07 

Hs.47278 Human Chromosome 16 BAC clone CIT887SK-A-735G6 2.07 

Hs*4752 spectrin SH3 domain binding protein 1 2.069 

Hs.15981 ESTs; Weakly similar to line-1 protein ORF2 [H.sapiens] 2.067 

Hs.155924 cAMP responsive element modulator 2X64 

Hs.61661 ESTs; Weakly similar to DY3.6 [C.elegans] 2.063 

Hs.158135 KIAA0981 protein 2.063 

zq48a01.r1 Stratagene hNT neuron (#937233) Homo sapiens cDNA clone 2.061 

Hs.13913 ESTs 2X58 

HSJ7013 ESTs 2X57 

HS23423 ESTs 2X56 

Hs.43149 ESTS 2X56 

Hs.187492 ESTs; Weakly simflar to ALR (H^aplens] 2X56 

Hs.165062 ESTs 2X54 
Hs.166648 ESTs 2X54 
H&48349 ESTs 2X52 
Ha3737 ESTs 2X52 
132617 AA171913 HsX338 carbonic anhydrase XII 2X5 

131528 N36167 Hs.28274 ESTs 2X5 

113254 T64438 Hs.1 1449 DKFZP56401 23 protein 2.05 

122785 AA459978 Hs.99508 ESTs 2.05 

107203 020426 H&5656 EST 2.05 

105713 AA291321 Hs.184319 ESTs; Moderately similar to KIAA1008 protein [H.sapiens] 2.046 

129385 D82675 Hs.110950 Homo sapiens clone 25007 mRNA sequence 2X42 

119116 R43845 H&645S5 DKFZP566E2346 protein 2.04 

116405 AA600253 Hs.55601 ESTs; Highly similar to host cell factor 2 [H.sapiens] 2.04 

125924 AA526849 Hs.82109 syndecan 1 2X39 

105599 AA279442 Hs.143460 protein kinase C; nu 2.037 

119741 W70205 Hs.43870 Hresm family member 3A 2.037 

101449 M21494 Hs.118843 creatine kinase; muscle 2.036 

107109 AA609943 HS32793 ESTs 2X34 

117040 K89112 yw25e5.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE25328 2.034 

132906 AA142857 Hs.234896 ESTs; Highly similar to geminin [H.sapiens] 2X31 

105479 AA255548 H&23467 ESTs 2.027 

102031 U04698 H&2156 RAR-related orphan receptor A 2.027 

119846 W80363 H&58446 ESTs 2X24 

124809 R46482 Hs.106875 ESTs 2X24 

130286 AA041548 Hs.154023 KIAA0573 protein 2.023 

124457 N50114 Hs.128704 ESTs 2X17 

125144 W37999 H&24336 ESTs 2X17 

120581 AA281257 Hs.125868 ESTs 2X14 

104931 AA062731 Hs.108319 thyroid hormone receptor-associated protein; 150 kDa subunit 2X12 

120548 AA278846 Hs.187634 ESTs 2.011 

113933 W81362 Hs.30567 ESTs 2.011 

123072 AA485041 Hs.1 04308 ESTs 2X09 

123648 AA609323 Hs.1 12689 ESTs 2X08 

116875 H67749 Hs.161022 EST 2.003 

103179 X69398 Hs.82685 CD47 antigen (Rh-related antigen; integrin-associated signal transducer) 1X95 

103478 Y07755 Hs.38991 S100 calcium-binding protein A2 1X95 

111007 N53378 Hs.22543 ESTs 1.995 

120470 AA251797 zs11f3^1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 1X89 

112280 R53457 Hs56040 ESTs; Weakly similar to fatty acid omega-hydroxylase [H.sapiens] 1X89 

114127 Z38652 Hs.106961 ESTs; Weakly similar to TYL [H.sapiens] 1X88 

129863 AA151005 Hs.129872 sperm surface protein 1X88 

106320 AA436608 ESTs 1X88 

108933 AA147224 Hs.71814 ESTs 1X86 

105906 AA401633 H&22380 ESTs 1X82 

109029 AA157911 Hs.72200 ESTs 1X82 

118470 N66769 HsX2781 ESTs 1X75 

115358 AA281886 HsX8923 ESTs 1X75 

115257 AA279060 Hs.193516 B-cell CLL/lymphoma 10 1X74 
126879 AA719776 zh38g04.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:414390 1X74 

109547 F01479 * H&26986 ESTs 1X73 

127111 AA805728 H&220509 ESTs 1X69 

101266 L36645 Hs.73964 EphA4 1X66 

129319 AA037467 H&30340 ESTs 1X65 

106211 AA428240 H&126083 ESTs 1X62 

112753 R93696 Hs.169882 ESTs 1X61 

120489 AA255538 Hs.1 90504 ESTs 1X59 

129699 AA458578 Hs.12017 KIAA0439 protein; homolog of yeast ubiquitin-protein ligase Rsp5 1X56 

105425 AA251129 H&24416 ESTs 1X53 

134740 L37362 H&X9455 opioid receptor; kappa 1 1.95 

109324 AA210700 HsX6405 Homo sapiens mRNA; cDNA DKFZp564P056 (from clone DKFZp564P056) 1.95 

124303 H93043 Hs.107070 ESTs 1X5 

102337 U36922 Human fork head domain protein (FKHR) mRNA, 3' end 1X48 

109441 AA228100 KsX6998 nuclear factor of activated T-cells 5 1X46 

127364 AA179573 Hs.90061 progesterone binding protein 1X42 

105255 AA227498 Hs.3623 ESTs 1X42 

130672 L19783 Hs.177 phosphatidylinositol glycan; class H 1X42 

104301 D45332 Hs.6783 ESTs 1X4 

132442 R62589 Hs.167419 ESTs 1X39 

105519 AA258063 H&23438 ESTs 1-937 

132902 AA490969 H&168147 ESTs 1.936 

118873 N89881 Hs.44577 ESTs 1X36 

114124 Z38595 Hs.125019 ESTs; Highly similar to KIAA0886 protein [H.sapiens] 1X34 

115075 AA255486 H&X8045 ESTs 1.933 
110895 H93463 Hs.124777 ESTs 1531 

105380 AA236209 Hs, 187626 ESTs 1531 

124898 T56013 Hs.77910 34iyfoxy-3^myta^^ 1529 

121816 AA424814 Hs.187509 ESTs 1527 

111717 R23241 Hs.110778 STAT induced STAT inhibitor-2 1525 

128874 H08245 Hs.106801 ESTs 1525 

109331 AA219599 Hs.184245 KIAA0929 protein Msx2 interacting nuclear target (MINT) homolog 1513 

126129 K82165 Hs.40334 ESTs 1511 

115553 AA369027 Hs.71414 ESTs 1505 

113811 W44928 Hs.4878 ESTs 1505 
108345 AA070906 zm66dU1 Stratagene neuroepithelium (#937231) Homo sapiens cDNA clone 1504 

120472 AA251875 Hs.104472 ESTs; Weakly similar to Gag-Pol polyprotein [M.musculus] 1503 

116602 080063 Hs.241673 EST 1501 

121121 AA399371 Hs.189095 ESTs; Weakly similar to zinc finger protein SALL1 [H.sapiens] 15 

125330 AA401804 Hs.114574 ESTs 1596 

130095 F01831 Hs.14838 ESTs 1594 

119782 W72982 H&58262 ESTs 1594 

104115 AA428090 Hs.26102 ESTs 1593 

131313 C17938 Hs.22370 Homo sapiens mRNA; cDNA DKFZp564O0122 (from clone DKFZp564O0122) 1591 

105583 AA278907 Hs24549 ESTs 1591 

122825 AA461195 H&99580 ESTs 1587 

119495 W35390 Hs.55533 ESTs 1586 

130309 AA134289 H&15423 Homo sapiens BAC clone RG114B19 from 7q31.1 1586 

125628 AA418069 Hs241493 natural killer-tumor recognition sequence 1586 

110611 H66947 Hs.14671 ESTs; Highly similar to gene ERCC5 protein [H.sapiens] 1585 

117301 N22569 Hs.43215 ESTs 1584 

131406 N92239 Hs56471 Wnt inhibitory factor-1 1581 

128428 AA013312 Hs.64988 ESTs 1581 

120285 AA182882 Hs.111110 titin-cap (telethonin) 1578 

112724 R91753 Hs.17757 ESTs 1578 

103121 X63679 Hs.4147 translocating chain-associating membrane protein 1575 

124381 N26765 Hs.1 09008 ESTs 1575 

117226 N20468 Hs.177322 ESTs; Weakly similar to putative p150 [H.sapiens] 1575 

105610 AA279991 Hs.124691 ESTs; Weakly similar to trithorax homologue 2 [H.sapiens] 1575 

111229 N69113 Hs.110855 ESTs 1575

120627 AA285079 Hs.190474 ESTs 1573

107048 AA600012 Hs.10669 ESTs; Moderately similar to KIAA0400 [H.sapiens] 1572

104041 AA381902 Hs.197114 RNA binding protein 1572

115162 AA258366 Hs.227806 ras GTPase activating protein-like 1572

102239 U2672B Hs.1378 hydroxysteroid (11-beta) dehydrogenase 2 157

100043 M10098 AFFX control 18S ribosomal RNA 1568

120298 AA191353 Hs32385 ESTs; Weakly similar to KIAA0970 protein [H.sapiens] 1567

129011 S72869 Hs.107932 DNA segment; single copy; probe pH4 (transforming sequence; thyroid-1; 1587

134851 R44479 Hs.90232 KIAA0552 gene product 1566 

117392 N26175 Hs.93405 ESTs 1564 

114530 AA053027 Hs.191797 ESTs 1563 

123541 AA608794 Hs.1 12592 ESTs 1563 

124890 R78618 Hs54145 ESTs; Weakly similar to RAS-RELATED PROTEIN RAB-8 [H.sapiens] 1562

105299 AA233511 Hs.194720 ATP-binding cassette; sub-family G (WHITE); member 2 1561

103560 Z20656 Hs.182787 myosin; heavy polypeptide 6; cardiac muscle; alpha (cardiomyopathy; hypertrophic 1) 1561

113073 T33637 Hs.6841 ESTs 158 

120407 AA235040 Hs.107283 ESTs 1559

103892 AA243523 Hs.17155 ESTs 1558

123795 AA620381 Hs.70488 ESTs 1557 

108524 AA084323 Hs.68138 ESTs 1557 

113953 W85812 Hs.187554 ESTs 1556 

110721 H97678 Hs51319 ESTs 1556 

129426 AA412087 Hs.168272 EST; Highly smlr to prot inhibitor of activated STAT prot PIASx-alpha [H.sapiens] 1553

112102 R44840 Hs.21303 ESTs 1552

118502 N67317 HsJOISO ESTs O 1552 

107619 AA004955 Hs50015 ESTs 1551 

100438 D87446 Hs.75912 KIAA0257 protein 155

120652 AA287312 Hs.191648 ESTs 155 

121643 AA417078 Hs.1 93767 ESTs 1543 

117387 N26011 H&53810 ESTs 1543 

132084 Y12394 Hs5886 karyopherin alpha 3 (importin alpha 4) 1543

124449 N48593 Hs.121620 ESTs 1541 

12)263 AA173440 Hs.193919 ESTs 1538 

127226 AA731038 Hs5463 ribosomal protein S23 1538
111837 R38447 Hs.24453 ESTs 1335

128727 M64174 H&50651 Janus kinase 1 (a protein tyrosine kinase) 1 £34 

114439 AA018937 Hs, 128629 ESTs 1333 

102332 U35637 Human nebulin mRNA, partial cds 133

126579 W72979 Hs.146082 ESTs 133 

102341 U37122 H&8110 adducin 3 (gamma) 133 

114246 Z39848 Hs.12079 ESTs 1328 

131757 D17532 Hs.316 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 6 (RNA helicase; 54kD) 1323

108904 AA138521 Hs.71148 ESTs; Weakly similar to putative p150 [H.sapiens] 1323

115084 AA255566 Hs.42484 Homo sapiens mRNA; cDNA DKFZp564C053 (from clone DKFZp564C053) 1323

131957 AA609008 Hs.183232 ESTs 1322

100131 D12485 Hs.11951 phosphodiesterase I/nucleotide pyrophosphatase 1 (homologous to mouse Ly-41 antigen) 1 322

124163 H30539 Hs. 189838 ESTs 1321 

118204 N59859 Hs.48443 ESTs 1321 

107727 AA016021 Hs. 173091 DKFZP434K151 protein 132 

100357 D78156 Hs^41 548 RASp21 protein activator 2 132 

116295 AA489016 Hs.91216 ESTs; Highly similar to partial CDS; human putative tumor suppressor [H.sapiens] 132

124833 R54112 Hs.128697 ESTs 1317 

122587 AA453255 Hs3968 ESTs 1317 

114359 Z41589 Hs.153483 ESTs; Moderately similar to H1 chloride channel [H.sapiens] 1315

111289 N72253 H&238246 ESTs 1313 

110826 N30068 Hs. 15347 ESTs 1312 

104106 AA422123 Hs.42457 ESTs 1311 

130043 AA055404 Hs.193953 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1253

115864 AA432080 H&31200 ESTs 131 

129737 AA056140 Hs.122684 ESTs 131 

124477 N53158 Hs, 102682 ESTs 1309 

100782 HG3740-HT4010 Basic Transcription Factor 2, 34 Kda Subunit 1306

106101 AA421053 Hs.34395 ESTs 1306 

115479 AA287696 zs52h093l NCI_CGAP_GCB1 H sapiens cDNA clone IMAGE:701153 1304

116104 AA456635 Hs.78524 ESTs 1304 

114173 Z39050 Hs.21963 ESTs 1304 

132632 N59764 Hs.5398 guanine-monophosphate synthetase 1303 

119135 R49548 Hs. 159681 death effector domain-containing 1302 

131559 N91087 Hs.28728 ESTs; Weakly similar to F55A12.3 [C.elegans] 1301

126922 AA1 77138 H&161671 ESTs 13 

117375 N25427 Hs.108812 ESTs 13 

103571 Z25535 Hs.211608 nucleoporin 153kD 13

105978 AA406367 Hs.15973 ESTs 13 

125904 H22372 Hs.163586 ESTs 1.799 

133883 AA397915 Hs.77221 choline kinase 1.798

105777 AA348412 H&23096 ESTs 1.797 

110166 H19480 Hs.174309 ESTs 1.796 

105038 AA130273 Hs.7584 ESTs; Weakly similar to hypothetical protein; similar to [H.sapiens] 1.796

105427 AA251330 Hs.28248 ESTs 1.795

115278 AA279757 Hs.67466 ESTs; Weakly similar to BACN32G1 1 d [D.melanogaster] 1.794

133104 L13698 Hs35029 growth arrest-specific 1 1.794

131170 N48674 Hs.23796 Human DNA sequence from clone 1052M9 on chromosome Xq25. Contains the 1.792

100136 D13540 Hs.22868 protein tyrosine phosphatase; non-receptor type 11 1.791

127263 AA331157 EST35035 Embryo, 6 week, subtracted (total cDNA) 1 Homo sapiens cDNA 1.79

114157 Z38878 Hs£4979 ESTs 1.79 

125601 A1096717 Hs.247043 KIAA0525 protein - 1.788 

118472 N66818 Hs.42179 ESTs 1.787 

112456 R63925 Hs.28484 ESTs 1.787 

130236 N69682 HS31957 SC35-interacting protein 1 1.786

133297 AA600057 Hs.70266 KIAA0905 protein 1.784

125650 R40096 Hs.176578 ESTs 1.784

132056 T89386 Hs.38176 KIAA0606 protein; SCN Circadian Oscillatory Protein (SCOP) 1.783

129093 AA262710 Hs.108614 KIAA0627 protein 1.783 

123176 AA489020 Hs.193424 ESTs 1.782 

106340 AA441792 H&22857 chord domain-containing protein 1 1.781 

100598 HG2463-HT2559 Guanine Nucleotide-Binding Protein G25k 1.779

104038 AA374532 ESTB6676 HSC172 cells I Homo sapiens cDNA 5' end, mRNA sequence 1.778

122235 AA438475 Hs.190104 ESTs 1.777

105104 AA151771 Hs.76941 ATPase; Na+/K+ transporting; beta 3 polypeptide 1.776

107601 AA004636 HS30223 ESTs 1.776 

131467 W68255 H&27194 DKFZP434K1 71 protein 1.776 

118449 N66413 Hs.172466 ESTs; Weakly similar to KIAA0775 protein [H.sapiens] 1.776
107869 AA034030 Hs.155212 methylmalonyl Coenzyme A mutase 1.775

115527 AA342079 Hs.252055 ESTs 1.775

132471 T16305 Hs.49349 beta-site APP-cleaving enzyme 1.775

105968 AA406105 Hs.5344 adaptor-related protein complex 1; gamma 1 subunit 1.774

127548 AA373091 Hs.93832 Homo sapiens clone 24483 unknown mRNA; partial cds 1.774

106217 AA428379 H&24670 ESTs 1.773 

131214 N26777 Hs.172635 ESTs 1.773 

106295 AA435664 H&J583 similar to APOBEC1 1.773

106328 AA436705 Hs28Q20 KIAA0766 gene product 1.772 

124661 N93797 Hs.3090 EphB1 1.772 

122988 AA479166 Hs.105633 ESTs 1.772 

115504 AA291646 Hs.42733 ESTs 1.771 

105168 AA180208 Hs.16606 ESTs; Highly similar to CGK32 protein [H sapiens] 1.767 

129153 AA188618 Hs.161461 ariadne; Drosophila; homolog of 1.766

105829 AA398290 Hs.21965 ESTs 1.764

101811 M86917 Hs.24734 oxysterol binding protein 1.764

100138 D13628 Hs.2463 angiopoietin 1 1.764

124704 R07335 ye96cU1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 1.763

122314 AA442257 Hs.192076 ESTs 1.762

109865 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.761

106206 AA428069 HsJ9519 KIAA1 046 protein 1.758 

107135 AA620782 H&23247 ESTs 1.757 

105760 AA338960 H&28170 ESTs 1.756 

106288 AA435536 H&24338 ESTs 1.756 

103968 AA304566 H&3542 ESTs 1.756 

129559 AA234945 Ks.11360 ESTs 1.758 

117885 N50112 Hs.47023 ESTs 1.754 

107032 AA599472 Hs.247309 succinate-CoA ligase; GDP-forming; beta subunit 1.754

124807 R45963 Hs.233811 ESTs; Weakly similar to ORF2 [M.musculus] 1.753

100276 D42047 Hs.82432 KIAA0089 protein 1.753

110924 N47938 yy84a09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 1.751

133002 AF006082 Hs.62461 ARP2 (actin-related protein 2; yeast) homolog 1.751

132530 AA455917 Hs.50785 SEC22; vesicle trafficking protein (S. cerevisiae)-like 1 1.75

110759 N21671 Ks.19025 ESTs 1.75 

106138 AA424515 Hs.33264 ESTs 1.75 

107348 U43701 Ks.184776 ribosomal protein L23a 1.75 

115867 AA432162 H&165986 DKFZP586B2Q22 protein 1.749 

135398 AA194075 Hs.99908 nuclear receptor coactivator 4 1.747

113783 W19222 Hs.7041 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.747

134898 X98330 Hs.90821 ryanodine receptor 2 (cardiac) 1.745

132215 T10132 Hs.4238 KIAA0478 gene product 1.744

104229 AB002346 Hs.61289 synaptojanin 2 1.743 

116166 AA461558 H&202949 KIAA1 102 protein 1.743 

115433 AA284252 H&58372 ESTs 1.743 

114906 AA236545 H&54973 ESTs 1.742 

127425 AA470941 Hs.143162 ESTs 1.741 

131089 Z38807 H&22870 ESTs 1.739 

113498 T68908 Hs.189746 ESTs 1.738 

116710 F10577 Hs.70312 ESTs 1.735

127210 R51476 yg76f04j1 Soares infant brain 1NIB Homo sapiens cDNA clone 1.733

120554 AA279654 Hs.194524 ESTs 1.733

129940 U18242 Hs.13572 calcium modulating ligand 1.732

117023 H88157 H&41105 ESTs * 1.731 

111700 R22212 H&23361 ESTs 1.731 

116911 H72240 Hs.39292 ESTs; Moderately similar to KIAA0745 protein [H.sapiens] 1.731

106025 AA412063 Hsj6Q65 ESTs 1.728

106628 AA101984 Hs.61697 G-protein coupled receptor 1.726

111614 R12561 Hs.191146 ESTs 1.726

134134 L76703 Hs.173328 protein phosphatase 2; regulatory subunit B (B56); epsilon isoform 1.725

106886 AA489088 H&36545 ESTs 1.725 

117998 N52136 Hs.93828 ESTs 1.725 

121204 AA400422 H&55896 ESTs 1.725 

121342 AA404995 Hs.192480 ESTs 1.725 

131129 R27296 H&23240 ESTs 1.725 

116235 AA479181 Hs.186726 ESTs 1.725 

102423 U44754 Hs.179312 small nuclear RNA activating complex; polypeptide 1 ; 43kD 1.724 

110273 H29050 H&24096 ESTs 1.722 

108758 AA127395 H&222414 ESTs 1.722 

110672 HB8477 Hs.191178 ESTs 1.721 
120271 AA176404 Hs.111092 ESTs; Weakly similar to ZINC FINGER PROTEIN 136 [H.sapiens] 1.72

100227 D28915 Hs.82316 interferon-induced; hepatitis C-associated microtubular aggregate protein (44kD) 1.719

129232 W69459 Hs.109655 sex comb on midleg (Drosophila)-like 1 1.719

134663 W73367 Hs*750 ESTs 1.717

104962 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.717

120582 AA281290 Hs.125287 ESTs; Weakly similar to BC331 191 J [H.sapiens] 1.717

134891 F03517 Hs*0787 ESTs 1.716

106219 AA428567 Hs£6613 Homo sapiens mRNA; cDNA DKFZp586F1323 (from clone DKFZp586F1323) 1.715

116372 AA521311 Hs.13854 ESTs 1.713

107570 AA001870 Hs.237323 N-acetylglucosamine-phosphate mutase; DKFZP434B187 protein 1.713

106198 AA427816 Hs.11803 ESTs 1.712 

125138 W31479 Hs.129051 ESTs 1.712 

104973 AA085676 Hs.6763 K1AA0942 protein 1.712 

128710 J04813 Hs.104117 cytochrome P450; subfamily IIIA (niphedipine oxidase); polypeptides 1.711 

123994 D20899 Hs.107127 Homo sapiens mRNA; cDNA DKFZp564G022 (from clone DKFZp564G022) 1.711

127871 AA766511 Hs.128848 ESTs 1.71 

116089 AA455933 H&41324 ESTs 1.709 

123337 AA504153 Hs.132797 ESTs; Weakly similar to ORF YGL050w [S.cerevisiae] 1.708

123819 AA609200 Hs.162686 ESTs 1.708 

104781 AA026617 Hs.21610 ESTs; Highly similar to BAI1-associated protein 1 [H.sapiens] 1.707

115114 AA256468 Hs.88148 ESTs 1.705

117852 N49408 Hs.138102 KIAA0853 protein 1.705

127644 T57570 Hs.77039 ribosomal protein S3A 1.704 

111359 N91273 Hs.27179 ESTs 1.702 

131721 L36644 Hs.31092 EphA5 1.7

132438 F08925 Hs.48810 ESTs 1.7
132478 N67192 Hs.49476 Homo sapiens clone 7UA8 Cri-du-chat region mRNA 1.7
130990 F02488 Hs.21917 KIAA0768 protein 1.7
128499 AA487503 Hs.100636 ESTs 1.698

120780 AA342337 Hs.241569 ESTs; Modrtly smlr to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.697

132920 L06133 Hs.606 ATPase; Cu++ transporting; alpha polypeptide (Menkes syndrome) 1.696 

135037 U77948 Hs.184122 general transcription factor II; I 1.696 

110024 H11297 Hs-31050 ESTs 1.695 

134415 AA329274 Hs.82911 protein tyrosine phosphatase type IVA; member 2 1.694

102223 U24685 Hs.148226 Human anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4) gene; clone E11; VH4-63 non-productive rearrangement 1.694

126712 AA205862 Hs.7942 ESTs 1.694

101507 M27492 Hs.82112 interleukin 1 receptor; type I 1.692

106291 AA435551 Hs.30824 ESTs 1.691

116826 H58691 Hs.8215 ESTs; Weakly similar to double-stranded RNA-binding nuclear protein DRBP76 [H.sapiens] 1.69

135339 D59269 Hs.127842 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 783648 1.69
118250 N62602 yz75b6.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:288851 3' similar to contains Alu repetitive element; mRNA sequence 1.689

45 106470 AA450116 Hs.186180 ESTs 1.688 

108203 AA057678 Hs.63408 ESTs 1.687 

119748 W70313 H&126906 ESTs 1.686 

116576 D51228 Hs.79404 neuron-specific protein 1.683

123035 AA481392 Hs.105166 ESTs 1.683

126668 AA011616 Hs.184086 ESTs 1.681

101512 M28209 Hs250716 RAB1; member RAS oncogene family 1.678

102704 U76638 Hs£4089 BRCA1 associated RING domain 1 1.677

126218 AA256388 Hs.13649 Novel human gene mapping to chromosome 13; similar to rat RhoGAP 1.676

111160 N67277 Hs.9403 ESTs 1.676 

55 105937 AA404342 Hs.173531 ESTs 1.675 

114118 Z38520 Hs.175930 ESTs 1.675

109203 AA190634 Hs.108787 endoplasmic reticulum membrane protein 1.675

125245 W86608 Hs.7243 ubiquitin specific protease 24 1.675

102906 X06956 Hs.75318 tubulin; alpha 1 (testis specific) 1.675

125914 AA262925 Hs.180034 cleavage stimulation factor; 3' pre-RNA; subunit 3; 77kD 1.674

134294 U63289 H&81248 CUG triplet repeat; RNA-binding protein 1 1.674 

109742 F10108 Hs.183333 ESTs 1.673 

134674 D63878 Hs.87726 KIAA0154 protein 1.673

104079 AA402837 Hs.103238 ESTs 1.671

107554 AA001388 Hs.59844 ESTs 1.671

132439 AA243139 Hs.4863 Homo sapiens clone 25088 mRNA sequence 1.669
124515 N58172 Hs.109370 ESTs 1.668
124300 H92575 Hs.105959 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.663
126809 AA743475 Hs.171693 ESTs 1.667 
106095
101754 
105188 
113582 
119559 
119961 
123255 
111078 
113082 
119589 
104308 



AA419547 

M77142 

AA192306 

T91371 

W38197 

WB7535 

AA490890 



T40528 
W44692 



124424 
128890 
119400 

131631 
118229 
118533 
130666 
103093 
128667 
112933 
114546 
126705 
114399 



X59417 
N35314 
AA096157 
T92767 

AA486868 



N67954 

AA476307 

X60708 



100401 
105681 
132526 
133809 
115968 
116370 



T15530 

AA056263 

AA579377 

AA007595 

N79820 

085423 

AA284865 

AA460128 

AA034002 

AA447083 

AA521256 



109644 F04477 



103427 
132186 
131428 



X97303 



114503 
121242 
122414 
110632 
111389 
112449 
113070 
107229 
132710 
124664 
130166 
125040 
132972 
115873 
120408 



U17838 

AA549257 

AA039568 

AA400857 

AA446885 

H72344 



R63802 
T33464 



115259 
134330 
115117 
125162 
103946 



115528 
129704 
109313 
130457 
123076 
115113 
117731 



W83726 

N94814 

AA350690 

T78451 

H39627 

AA433916 

AA235045 

AA383773 

AA279071 

D20113 

AA256492 

W44682 

AA285246 

AA166917 

AA342301 

W81301 

AA206800 



HS.11713 
H&239489 
H&23926 
Hs.16824 

H&59015 
Hs.105273 
Hs.186574 
HsX246 
Hs.124177 
Hs.77904 
Hs.74077 
Hs.107265 
Hs.162364 



Hs29802 

Hs.180532 

Hs.49413 

Hs.194035 

Hs.44926 

H&103419 

H&221439 

Hs.132747 

Hs.180532 

H&220937 

H&50854 

Hs.171228 
Hs5074 
Hs.76359 
Hs.134522 
H&236204 



ESTs 

TIA1 cytotoxic granule-associated RNA-binding protein 

ESTs 

EST 

Accession not listed in Genbank

ring finger protein 9 

ESTs 

ESTs 

ESTs 

ESTs 

ribosomal protein S26

proteasome (prosome; macropain) subunit; alpha type; 6
ESTs

ESTs; Weakly similar to 25 kDa trypsin inhibitor [H.sapiens]

ye27oms1 Stratagene lung (#937210) Homo sapiens cDNA clone

IMAGE:118955 3', mRNA sequence.

s&tpiosophlto}ruxnolog2 

heat shock 90kD protein 1; alpha

ESTs

KIAA0737 gene product

dipeptidylpeptidase IV (CD26; adenosine deaminase complexing protein 2)

fasciculation and elongation protein zeta 2 (zygin II)

ESTs 

ESTs 

heat shock 90kD protein 1; alpha 

ESTs 

ESTs 

Homo sapiens mRNA for Cdc5, partial cds
KIAA1040 protein
similar to S. pombe dim1+ 



1.664 

1.663 

1.663 

1X61 

1X61 

1X57 

1j657 

1.655 

1X54 

1X52 

1X5 

1X5 

1.65 

1X5 

1.65 

1.65 

1X49 

1.648 

1.647 

1.647 

1X46 

1.646 

1X45 

1.644 

1X42 

1X4 

1X4 

1.639 



ESTs 



AA485211 
AA256480 
N46433 



ESTs; Moderately similar to NUCLEAR PORE COMPLEX
PROTEIN NUP107 [R.norvegicus]
Hs.204802 ESTs; Moderately similar to GLYCERALDEHYDE 3-PHOSPHATE
DEHYDROGENASE; LIVER [H.sapiens]
H.sapiens mRNA for Ptg-12 protein
KIAA1038 protein

PR domain containing 2; with ZNF domain
ESTs
ESTs
EST

ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens]
ESTs

ESTs; Weakly similar to UB2A [D.melanogaster]
ring finger protein 2
ESTs
ESTs

protease inhibitor 5 (maspin)
ESTs; Weakly similar to KIAA0765 protein [H.sapiens]
KIAA0916 protein
ESTs

ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens]
heat shock 70kD protein 4
ESTs
ESTs

splicing factor 3b; subunit 1; 155kD
ESTs; Highly similar to CGM4 protein [H.sapiens]
poly(A) polymerase
ESTs

ESTs; Weakly similar to Prt1 homolog [H.sapiens]
ESTs

ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens]
ubiquitin specific protease 22

ESTs; Moderately similar to zinc finger protein dp [H.sapiens]
cullin 4B
ESTs 
ESTs 
ESTs 



H&221040 

Hs£6719 

Hs.188602 

Hs.188083 

HSX7509 

HSX9087 

Hs.171635 

Hs.169111 

Ha.124186 

Hs.6298 

H&34644 

HSX5279 

HSX3540 

Hs.151411 

Hs.199961 

Hs.164967 

HSX0093 

Hs.190151 

Hs.191500 

Hs.13453 

HsX185 

Hs.49007 

Hs.109896 

Hs.111650 

Hs.72639 

HsX3929 

Hs.12064 

HSX6276 

Hs.155976 

Hs.190046 

Hs.44610 

Hs.46609 



1.637 

1.631 

1X27 
1X27 
1X26 
1X26 
1X25 
1X25 
1X25 
1X25 
1X24 
1X24 
1X23 



1X18 

1X17 

1X17 

1X16 

1.615 

1X15 

1X11 

1.61 

1X1 

1.609 

1.607 

1.606 

1X05 

1.604 

1.603 

1.602 

1.602 

1X01 

1.6 

1X 

1X 

1X 
123344 AA504338 Hs.171857 ESTs 1.599

131793 X88093 Hs3238 adenovirus 5 E1A binding protein 1.597

125370 AA256743 Hs.151791 KIAA0092 gene product 1.598

114918 AA236813 Hs.72324 ESTs; Highly similar to unknown [H.sapiens] 1.598

114807 AA160805 Hs.199832 ESTs 1.598

106103 AA151593 Hs.10130 ESTs 1.594
125004 T60120 yb68KK.s1 Stratagene ovary (#937217) Homo sapiens cDNA clone IMAGE:76347 3', mRNA sequence. 1.592

105658 AA282914 Hs.10176 ESTs 1.589

110455 H52172 yt85e8.s1 Soares_pineal_gland_N3HPG Homo sapiens cDNA clone IMAGE:23111 3' similar to contains Alu repetitive element; mRNA sequence 1.589

119780 W72967 Hs.191381 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.587
128983 AA211537 zn55d01 Jl Stratagene muscle 937209 Homo sapiens cDNA clone IMAGE:562081 5', mRNA sequence. 1.588

134675 AA250745 Hs.87773 protein kinase; cAMP-dependent; catalytic; beta 1.584

106431 AA252033 Hs.15036 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.584

120187 Z40251 H&56974 ESTs 1584 

115830 AA428137 H&86434 ESTs 1581 

135069 AA456311 Hs£39S1 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 1.581

122897 AA479295 Hs.106290 Kelch motif containing protein 1.581

119707 W67569 Hs.44143 ESTs; Weakly similar to SNF2alpha protein [H.sapiens] 1.58

131934 D80948 Hs.34922 ESTs 1.58

106141 AA424558 Hs.9302 phosducin-like 1.58

115271 AA279422 Hs5724 ESTs 1579 

131468 R27598 Hs.27197 KIAA0797 protein 1.577

131165 R98173 Hs.23763 Max-interacting protein 1.575

117273 N21680 Hs.43047 ESTs 1.575

101569 M33772 Hs.182421 troponin C2; fast 1.575

116127 AA459703 Hs.78070 v-myc avian myelocytomatosis viral oncogene homolog 1.575

120022 W90625 HS584S2 ESTs 1.575

117512 N32157 H&B2207 ESTs 1574 

106511 AA452865 H&206713 UDP-GattetaGlcfJAcb«ta1;4-3^^ 1573 

116415 AA609204 H&27973 KIAA0874 protein 1573 

127879 AA810215 Hs.189079 ESTs 1571 

125211 W72798 Hs.103177 ESTs; Wkly smlr to cDNA EST EMBL:D32579 comes from this gene [C.elegans] 1.571

114746 AA135638 Hs.223756 ESTs 1.571

122698 AA456112 HSJ98410 ESTs 1.57

116765 H12636 Hs.121585 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 1.568

130895 AA609828 Hs21015 ESTs; Highly similar to tetracycline transporter-like protein [M.musculus] 1.568

114338 Z41366 Hs.40109 KIAA0872 protein 1.567

111005 N53076 Hs5S96 ESTs 1.567

128135 AA913491 Hs.189143 ESTs; Modrtly smlr to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.567

112046 R43365 Hs.22273 ESTs 1.566

132160 AA281770 Hs.184081 seven in absentia (Drosophila) homolog 1 1.566

111568 R10153 Hs.20561 ESTs 1.566

127775 H04106 Hs.179902 ESTs; Weakly similar to NG22 [H.sapiens] 1.566

115359 AA281936 Hs.88914 ESTs 1.568

121845 AA425734 Hs.165066 ESTs; Weakly similar to hypothetical protein [H.sapiens] 1.565
127854 AA769520 ESTs; Weakly similar to REGULATOR OF MITOTIC SPINDLE ASSEMBLY 1 [H.sapiens] 1.564

120287 AA187679 Hs.111114 ESTs 1563 

114940 AA243012 Hs.75928 ESTs 1562 

126716 AA031700 Hs251962 ESTs 1.562

134161 U97188 Hs.79440 IGF-II mRNA-binding protein 3 1.561

125390 H95094 Hs.75187 translocase of outer mitochondrial membrane 20 (yeast) homolog 1.561

115334 AA281244 Hs.65300 ESTs 1.559

113721 T97931 Hs.18190 EST 1.558

114895 AA238177 Hs.76591 KIAA0887 protein 1.558

119341 T62571 Hs.146388 microtubule-associated protein 7 1.558

108012 AA039616 Hs51933 ESTs 1.558

130335 AA156499 Hs.8454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.557

134351 R82074 Hs.82109 syndecan 1 1.557

133300 D51401 Hs7G333 ESTs 1.553

106920 AA490899 Hs24462 ESTs 1.553

118744 N74075 Hs.84293 EST 1.552

126489 W20016 Hs.144228 ESTs; Weakly similar to ZINC FINGER PROTEIN 83 [H.sapiens] 1.55

115913 AA436720 Hs.65487 ESTs 1.55

107868 AA025234 Hs51260 ESTs 1.55

134520 N21407 Hs£57325 ESTs 1.55







109703 F09684 Hs24782 ESTs; Weakly similar to ORF YOR283w [S.cerevisiae] 1.55

120268 AA187938 H&55189 ESTs; Weakly similar to F25B5.3 [Celegans] 1S43 

106356 AA443277 Hs.31034 peroxisomal biogenesis factor 11A 1.548
120460 AA235627 Hs.11171 APG5 (autophagy 5; S. cerevisiae)-like 1.547

133950 D11961 Hs.77823 ESTs 1.546

126172 A1400862 H&142607 ESTs 1546 

114162 Z38909 H&22265 ESTs 1545 

101803 M86546 Hs.155591 pre-B-cell leukemia transcription factor 1 1.544

113817 T93630 Hs.17207 ESTs 1542 

104896 AA054228 H&23165 ESTs 1541 

114477 AA032013 Hs. 144260 EST 154 

110731 K98653 Hs.188006 KIAA0878 protein 1.54

130367 Z38501 Hs.5768 ESTs; Wkly smlr to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.538

130539 L07044 Hs.250857 Homo sapiens calcium/calmodulin-dependent protein kinase II mRNA; partial cds 1.538

134921 W60186 Hs.169487 Kreisler (mouse) maf-related leucine zipper homolog 1.537

130583 W24957 Hs.16281 ESTs; Moderately similar to similar to C.elegans protein encoded in cosmid T20D3 [H.sapiens] 1.537

133723 AA088851 Hs.75744 S-adenosylmethionine decarboxylase 1 1.537

106450 AA449469 Hs.11859 ESTs 1.536

104120 AA429838 Hs.89519 KIAA1046 protein 1.538

100533 HG1879-HT1919 Ras-Like Protein Tc10 1.535

130664 R09049 Hs.17625 ESTs 1.535

127122 AA279153 Hs.190049 ESTs 1535 

134264 T03391 Hs5087 ESTs 1535 

132319 AA418662 Hs.44625 ESTs 1535 

115465 AA286941 Hs.43691 ESTs 1533 

125003 T59442 Hs.100445 ESTs 1.532

102273 U30888 Hs.75981 ubiquitin specific protease 14 (tRNA-guanine transglycosylase) 1.532

121875 AA426299 Hs.98510 ESTs 1.532

114366 Z41747 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.531

132944 AA054515 Hs.6127 ESTs; Weakly similar to prostate-specific transglutaminase [H.sapiens] 1.53

111199 N68210 Ks£9822 ESTs 153 

113494 T88878 H&258738 ESTs 1529 

128515 AA490882 Hs.112227 ESTs 1528 

133124 AA156049 Hs.65490 ESTs 1528 

104785 AA027163 Hs.7942 ESTs 1526 

105595 AA279408 Hs25866 ESTs 1526 

130198 U67156 Hs.151988 mitogen-activated protein kinase kinase kinase 5 1.526

114297 Z40758 Hs.173091 DKFZP434K151 protein 1.525

112876 T03488 Hs.4842 ESTs 1.525

127500 AA525014 Hs.162115 ESTs 1.525

120519 AA258585 Hs.129887 cadherin 19 (NOTE: redefinition of symbol) 1.525

119859 W80702 Hs58461 ESTs 1.525

129944 L00389 Hs.1361 cytochrome P450; subfamily I (aromatic compound-inducible); polypeptide 2 1.524

118854 N8967D Hs.42148 ESTs; Weakly similar to Su(P) [D.melanogaster] 1.523

123964 C13961 Hs.210115 EST 1.523

111676 R19414 Hs.166459 ESTs 1522 

128332 A1079523 H&134173 ESTs 1522 

130455 X17059 Hs.155958 N-acetyltransferase 1 (arylamine N-acetyltransferase) 1.521

125181 W58461 Hs.12396 ESTs 1.521
127093 AA768241 oa72d02.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:1317795 3', mRNA sequence. 1.521

132156 AA157401 Hs.4113 S-adenosylhomocysteine hydrolase-like 1 1.521

125303 Z39821 Hs.107285 ESTs 1.52

132697 AA281951 Hs5518 Homo sapiens mRNA; cDNA DKFZp566J2146 (from clone DKFZp568J2146) 1.52

117086 H93135 Hs.41840 ESTs 1519 

113355 T79203 Hs.14480 ESTs 1518 

108621 AA101811 Hs59506 ESTs 1518 

109384 AA219172 Hs.66849 EST 1518 

128510 X94703 Hs.100816 RAB28; member RAS oncogene family 1517 

132968 N77151 Hs.61638 myosin X 1515 

117035 H88798 Hs41182 ESTs 1515 

116781 H22985 H&52132 ESTs 1513 

108677 AA115629 Hs.118531 ESTs 1513 

130214 H78003 Hs.15266 ESTs 1513 

134700 AA481414 Hs.8868 golgi SNAP receptor complex member 1 1.512

116618 D80783 Hs.45224 ESTs 1.508

126257 N99638 tumor necrosis factor receptor superfamily; member 10b 1.503

125859 AA806808 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.508
113837 W57698 Hs5888 ESTs 1.507

114317 Z41038 Hs.469 succinate dehydrogenase complex; subunit A; flavoprotein (Fp) 1.507

100311 D50640 Hs.184653 phosphodiesterase 3B; cGMP-inhibited 1.507

126802 AA947601 Hs.97056 ESTs 1.506

128681 R82837 Hs.103329 KIAA0970 protein 1.506

134194 AA233231 Hs.79828 ESTs 1506 

108953 AA149652 Hs.42128 ESTs 1504 

133240 D31161 Hs.68613 ESTs 1502 

132671 X76302 Hs54649 putative nucleic acid binding protein RY-1 1.501

132609 Z48923 Hs.53250 bone morphogenetic protein receptor; type II (serine/threonine kinase) 1.501

105574 AA278678 Hs.258567 ESTs 1.5

113718 T977B2 H&25G268 ESTs 15 

127824 AI208365 Hs.127811 ESTs 1.5

130132 U55938 Hs.184376 synaptosomal-associated protein; 23kD 1.5
127394 AA453224 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.5

100485 HG1111-HT1111 Ras-Like Protein Tc21 1.5

101078 L04510 Hs.792 ADP-ribosylation factor domain protein 1; 64kD 1.5

128611 AA456845 Hs.102471 KIAA0680 gene product 1.5
TABLE 12A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 12. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier
CAT number: Gene cluster number
Accession: Genbank accession numbers
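As an illustration only, the three-column layout described above (Pkey, CAT number, Accession) can be read into a lookup keyed by probeset. The following is a minimal sketch, not part of the disclosed method: the names ClusterRow, parse_row and build_index are hypothetical, the two example rows are transcribed from Table 12A, and the "_1" suffix on the CAT numbers is an assumed reading of the scanned text.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class ClusterRow(NamedTuple):
    """One Table 12A row: probeset, gene cluster, member accessions."""
    pkey: int
    cat_number: str
    accessions: Tuple[str, ...]


def parse_row(line: str) -> ClusterRow:
    # Columns are whitespace separated: Pkey, CAT number, then accessions.
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], tuple(fields[2:]))


def build_index(lines: Iterable[str]) -> Dict[int, ClusterRow]:
    # Map each Pkey to its parsed row for direct lookup.
    return {row.pkey: row for row in map(parse_row, lines)}


# Example rows (CAT number suffix format is an assumption).
EXAMPLE_ROWS = [
    "124704 292319_1 R07335 R07640",
    "104038 264235_1 AA374532 AA421255",
]
index = build_index(EXAMPLE_ROWS)
```

With such an index, the accessions comprising the cluster for a given probeset can be retrieved as index[124704].accessions.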



Pkey CAT number Accession

108536 119811_1 AA084524 AA339253 AW966289

117040 46956_1 AW970600 AA503323 H89218 AFD86031 HS9112

100782 18457_1 AA355435 NM_001516 Z30093 T28405 AW949486 AA461142 AA410532 Al 652073 AA521208 A1970H1 AI96B234 AI026102
AA7 1 3583 AW1 35876 AA9366 1 4 AA770300 AE42635 AA377033 AW96Q263 AW607683 AI273803 AM1 0287 AI04051 3
AA460838 AI803916 AW294095 AW448680 AW798677 AWS75048 BE5421 16 AL120521

100819 3022.1 L34840 NM_003241 U31905 A1546931 AI791616 A1973065 AI792321 A1546937 AI685880 AI732835 AI682360 AA420653
AA564047 A1682323 AI824614 AJ659889 AI680052 A1970887 Al 6231 08 AA420692 A1418074 AA631018 A1810595 AW291463
AW449930 AI668908 AI970818

100824 5.38 AI393237 A1521317 AI761348 AF025841 D43968 AW994987 L34598 AF025841 D89789 D89788 D89790 AW998932
AI971742 A1310238 X90976 AW139668 AW674280 AI365552 AA877452 AV657554 C75229 AA376077 AI798056 AW609213
W25586 H30149 BE075089 BE075190 AW580858 H99598 AA425238 AA133916 AW363478 BE158121 BE158127
AW467960 BE158135 BE158126 BE158145 N92860 AA847246 AW6168B AI361423 AA878154 AA043767 A/863712
AI559226 AW339007 AJ371268 A1368901 AA046624 AA134739 AW449154 AA130232 AI458720 AAS6251 1 AI700627
R70437 AW0O4O08 AA045229 AI671572 K99599 AA043768 AI685454 AI871685 N29937 X90977 AA524240 AI142114
AI825750 AI567805 AI631 365 AI347893 AA1 34740 F20669 AA046707 AW79321 6 AW963298 AW959380 AA363265
AI784593 AI268201 R69451 AV657618 AI695588

125004 264197_1 BE312163 AJ23079B AA374482 AI926059 AA622653 A1860704 BE139185 AW296884 T60238 T60120

102313 27608_1 U33921 AI190489 AA573311

102337 553_1 AI814663 AA806761 AA76S241 AA019317 AA092255 AA035405 T85079 AA89015 1 AI373959 TB5080 BE153728 AA740848
BE080682 AL048137 AW 18231 6 A1699468 AW274481 AW407538 AA3Q6562 AW950024 AW949943 AL045703 AW84319Q
W25132 BE612794 AA304266 AW958054 H25673 AV646563 AV646573 BE172990 AW593488 AAS85181 AA164998
AK46478 AA345406 AI277554 AA134749 AA856624 BB613247AA299003 AL048138 AA028121 T92510AI923835
AW02O440 A1401594 AI889401 N93290 AA044247 AAQ28100 AI582845 AA81 1 151 AI741811 A1S25878 AA448277 AA172221
AI214783 BE220793 AA022746 AIQ32882 AA022849 AI928385 AA573472 AI420686 AW0729Q2 AI799493 A1873506
AI468977AJ192079 AW68976 AAD44272 AW015701 AW316979 AA933042 AA609017 AI318393 AI424571 A1934945
AA172023 AW050917 AA846180 AA134748 A1003947 AT766769 AW006697 AA653517 AW575680 A1474214 AA401478
U36922 AA927064 AA868000 D62654 T91745 AW500202 AA194764 AA746346 AA130464 AW1 17498 AA054526 N26432
HQ2534 H04964 AW303387 BE300931 A121 8049 A1208073 AW1 82749 AA983630 AI1 47585 AA1 94765 AA054534 AA922720
AI436585 A1346535 AA134269 AA280923 AA897422 AA019559 AW274010 AA035406 AA917879 H99327 W32908 Ai2 16046
AW496823 AA01 9414 H82288 W35284 A1936621 AI767113 AA866177 AW367874 H82398 AF032685 AW300151 AW467069
AAB09346 All 88507 A1494178 AA872752 Al 531631 U 023 10 NM.00201 5 AA815006 A1382453 AW197658 AI761654
AI804396 A1382221 AIS13640 AI439635 AI523901 AW517242 AI221705 AW298104 AW204560 AW573095 AW028783
AW01 4650 AT766744 AI808294 AB98758 AI041 809 AI766667 AI479103 AA872797 AA769305 AA765080 AA334168
AI472322

124704 292319_1 R07335 R07640

116988 165904_1 AW953579 AW953580 AA244436 H82527 AA381046 AA244483 H82526

124825 330773_1 AA501669 R52088

110455 46874_1 H52576 AF085971 H52172

126257 182217_1 N99538 AW973750 AA328271 H90994 AA558Q20 AA234435 N59599 R94815

125624 154135_1 AW968363 AA465492 R34539 AA165411

104038 264235_1 AA374532 AA421255

103427 43892_1 BE514383 AA071273 AW247987 AW673286 BE312102 AW749824 BE071985 AW577383 BE071945 BBJ72005 AW577355
BE071865AW239231 BE072000 BB071860 AW577360 AW749830 AVW73020 X97303 AW999522 BE000192 BB62219
BE266655BE264970

104142 113242_1 AA074713 AA447006

127093 47721_1 AW977549 AA256038 AL365415 AW500455 AA768241 AW968097 Z17849 AA256104
127210 15307.6






232161J 

135197 29440_1



127394 
126879 



120470
127854 
121367 
106320 



115479 
101026 



304844J 

1860^2 

171841J 

188975J 

443883 J 

280429J 

6435J 



201515J 
11075J 



125873 10492.1 AW271838 AL1 33605 C01 646 H29959 AA999896 D60876 AW939454 AW961 178 AA315244 H 14437 AW3861 18 N46512 

AW272021 AI768516 BE466421 AI082809 AI804454 AA905101 AW1 73388 N36942 AW614169 AI080483 N29489A1500550 
AA994475 AA614484 AA707368 AA593145 AA569473 AW627815 AI828244 N63226 N42300 
125954 4457.1 NM.016353 AB023584 W44753 R09585 AA382865 R23772 AI814257 AA974048 AK001608 AI935838 AW440609 AI420022 

AA777386 AA806969 AI554876 AI584006 AI886556 AI688634 AI697997 AI014540 A1806683 AJ741202 AW263154 
AW297238 Al 149 951 A1589076 AW0B2158 AW814265 AA931887 AA781969 R09490 AA484643 A12071 21 AI088390 
A1538065 AI619547 A1741 925 AI702846 H40846 R93943 AW747979 AA461346 U301 63 AA326Q23 AI535992 AW242B70 
Ai244025 AI222558 W38425 AW473630 AI624599 AI921226 AI683152 AI096458 AI123822 AW170802 C16447 AI337674 
D25726 AW339368 AW771259 AA481174 
H48372 W01628 
AA305278 AA223833 

110924 6443J AVW)58463 AF1 95768 AA6801 45 T86901 W60373 W60281 NMJXJ7222AF1 06862 AI000795AA1 671 88 
AW884503 AW891313 AW891332 AW891312 AI984924 AI123518 N75170AA1 31614 H25330 AI913358 AI742277 W25576 
R58771 AW445159 AW888628 AW888627 AW274674 AI088482 N52314 N34282 AW001769 AI338943 T6S784 AI288963 
AW468676 AW237528 H25289 N71690 AA610128 AI143458 AI082599 N49144 AA854773 AW66341 1 AW610151 N47938 
AW601626 AA167189 AA918304 AA805205 BB069498 AA652836 BE069499 AI699298 AW249928 AW888578 BE567635 
T10726 AW604715 D54245 D53062 D55610 D55555 AA301376 AI133498 N777B8 AI936320 AW090734 A1269977 N50828 
AA550814 AI421993 AI005384 N50813 D6Q292 D59349 AA131710 D81698 D81699 
AA331 156 AA331 157 AA331 155 

U76456 NM.003256 AF057532 AA193414 AW293304 AW963378 AA313095 AI359841 AI969312 AI080163 AW448928 
AI671136 BE466399 AI637967 AK71873 AW196583 AW071635 AI634427 AW298872 AW292470 AA193650 
BE161832 AA453224 AA485772 
D90391 M55575 AI652268 AA719776 
AA524886 AW971347 AA211537 
AW971327 AA524988 AW628653 AA251797 
AW976796 AA769520 
AA432071 AA405648 AW000908 T16347 

AB028957AL120001 AI267678 H1 0928 R1 9844 AW970334 AA393182 F05472 F1 1711 H09908 K50250 A1815411 BE463679 
061468 AW970253 060889 C15548 061011 D60867 AI815795 AA534831 D81386 AW235039 A138215B 081 174 AA416699 
AA852310 H09789 H10929 K09813 F09369 R44721 D51515 Z38456 R14004 T66255 F12148 F12139 AW351702 M85350 
AKJ18713 AW972450 AW972645 AA514964 T66172 F09785 F09776 AA436608 T05327 T071 1 8 AA339352 
AW301608 N46706 AA649093 AA287595 AW81 1753 AA287596 N39260 

NM.001874 J04970 T91426 AW205201 T84979 AA255727 AA847837 R021 64 T91339 AV651884 AV651835 AV651350 
AV6501 18 AV651338 A12720Q2 AI367796 AA830651 AA262112 AW151198 

AU076696 AA219720 AL135197 AA305877 N56376 AA318063 AA130725 AW954903 BE541230 AW383312 U88753 085423 
AI679458 A1 122932 AB007892 A1583919 BE160134 F08104 R34903 F1 3440 AA095444 AA262453 AA1 91036 R17895 
T81268 BE149776 AE79537 A11431 13 AA361072 AW959030 AW268817 AA81 1533 BE275179 AI221677 T65147 R49293 
AA249176 BE000290 AA768053 F09494 BE092845 BE172099 Z41 177 AA044750 AI909768 BE140795 BE140574 AW845210 
AW752452BE243244 AA843664AI300080BE1 69032 AW189979 BE004869 AA821872 AI951772 AI678897 AI926598 
N6281 3 AI35091 2 AW808791 AI309602 AI9831 38 AW875592 AI655073 AW875626AA1 30606 A137Q827 C75528 C75554 
AW263335 A1344426 BE004788 AA576220 AA604824 A1431405 AA749378 R38882 AW955075 AA173821 G75657 
AA219672 AW768408 R43141 AI431414 AA483343 A1S73792 T17294 AW7701 87 N74285 AI476404 A1088288 AA654152 
AW974864 BB617311 BE243328 BE168049 
130542 28089J3 U64675 A W1 67507 AW 167508 BE2185S8 AA779360 W85722 AL044843 BE153404 AF012088 AW89861 1 AW89861 0 

BE159405 BE092191 AW890826 AW369841 AW3S8064 AW5067Q2 AUD44731 R82691 AA419346 AA416558 H96045 
AL040450 AI640531 AI8Q8434 AL046613 AW855784 AW382469 AL048881 AL049015 AA094272 AA888908 AA417294 
AW237786 R59793 AL044916 082402 AK16854 AKJ79342 H96406 AL037845 A191 5900 AA9721 33 AI478783 T31074 
221135 Z21395 AA352182 R13918 AA430178 C17811 A1371824 AI742256 AA926801 N79156 AA350610 AA081971 N83639 
R35544 AA312292 AW952080 N42322 AA171957 AA565297 R89207 AA504106 AI630782 AAB26482 AI301579 T36241 
AW966618 228426 AL043480 A1124636 AA393449 T1 9504 AW887823 AI289814 N53979 AL043571 A1632764 AI859613 
AI98S308 AJS83212 A1984499 AJ133258 C05838 AW512761 AI041260 BE466240 219161 A1351190 N67549 AI373374 
AA400373 AW440914 AW514879 AA770146 AI358754 R51 1 13 A1283773 AA649886 T30543 054358 R37750 T03358 
T15451 T15880 AA999689 N67396 AI056289 T85597 N62441 R89099 R00035 T85596 R61335 R00128 N83359 AI535964 
AI207768 M31468 NM.012250 W0 1322 AA253280 AA253233 AA293148 AW582106 R79880 AA459547 AA363459 
AA234396 N31669 H44468 AA434587 AW363088 AW993541 
AA070906 AA070934 

X51501 NM.002652 Y10179 J03460 A1791618 AI821473 AA916588 AA564298 AA916110 AI972286 AI420470 AI568790 
A1597724 AW205207 AI659305 AI791620 AA532383 A1821475 AA526498 
32905J NM.012249 M31470 AL0431 08 AA262561 AA178883 T29433 AA313329 W48807 AW404323 AA453560 AW403227 K94816 

W17101 AA165152 W23989 AA091310 
2&Q2Jt AL121734 054896 AA424269 BE242906 AA362118 BB016454 AI280348 AL048769 M35543 AA757734 Al 128865 H2Q289 

H23728 AI203445 H41481 H18237 H44081 H92839 AI928621 H75675 D51 148 AT796198 AW390453 055579 054145 D53996 
054015 R37664H17541 AA668681 T65061 R15867 AW468123 R16049 H69030 AA054228 H16070 F09655 R92144 T0^21 
R05473 H92840 AA018186 R91707 
102332 14745.3 U35637 AA112989 Z19308 
118250 genbank_N62602 N62602 
103678 entrez_Z84483 Z84483 
119400 genbank_T92767 T92767 
119559 entrez_W38197 W38197 



35 100401 24827J 



40 



100485 30576J2 



112277.6 
19669.1 



100522 
100533 
100598 
TABLE 13: shows genes, including expressed sequence tags, up-regulated in prostate tumor 
tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos Hu02 GeneChip 
array. Shown are the ratios of "average" normal prostate to "average" prostate cancer tissues. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
R1: Ratio of background subtracted normal prostate : prostate tumor tissue 
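As an illustration of how an R1 value of this kind could be computed, the following is a minimal sketch and not the procedure actually used to generate Table 13. It assumes that the "average" for each tissue class is the arithmetic mean of the per-sample probeset intensities, that a single background value is subtracted from each mean, and that a small floor prevents division by zero or by a negative value. The function name, its parameters, the floor and the sample intensities are all hypothetical.

```python
from statistics import mean


def r1_ratio(normal, tumor, background=0.0, floor=1.0):
    """Return background-subtracted mean normal signal / mean tumor signal.

    All parameters are illustrative assumptions, not values from the source.
    """
    # Average each tissue class, subtract the background, clamp at the floor.
    normal_avg = max(mean(normal) - background, floor)
    tumor_avg = max(mean(tumor) - background, floor)
    return normal_avg / tumor_avg


# Hypothetical intensities for a single probeset.
ratio = r1_ratio(normal=[110.0, 90.0], tumor=[2100.0, 1900.0], background=50.0)
print(round(ratio, 3))  # prints 0.026
```

Under these assumptions a ratio well below 1, as for every entry listed in Table 13, corresponds to a higher signal in tumor tissue than in normal tissue.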




Pkey ExAccn UnigeneID Unigene Title R1 

333516 CH23_FGENES.173_1 0.028 

337954 CH22 W EMACOO55OO.G0JSCANS6^ 0.029 

332496 R73299 Hs.204354 ras homolog gene family; member B 0.03 

337944 CH22_EMAC005500.GENSCAN.B9-7 0.033 

334111 CH2?J=GENES-330J0 0.033 

333657 CH22J=GENES241_2 0.034 

327718 CH.04_hsgIj6525284 0.034 

336355 CH23_FGENES.817J5 0.035 

322011 AL137354 EST cluster (not in UniGene) 0.035 

336377 CH22J=GENES.821 5 0.038 

300254 AW079607 Hs.188417 ESTs; Weakly similar to ZnT-3 [H.sapiens] 0.037 

330096 CH.19_p2gi|601S278 0.037 

335191 CH22_FGENES.507_6 0.038 

334040 CH22_FGBJES.322_8 0.039 

333586 CH2£J=GENES.204J2 0.04 

333285 CH22_FGENES.132 JL 0.042 

313326 AK388120 Hs.122329 ESTs 0.043 

329517 CH.10_p2gI)3983513 0.043 

333403 CH2*_FGENES.144_21 0.043 

335226 CH22_FGENES.513 11 0.044 

335976 CH2£J=GENES.652J1 0.045 

333637 CH22_FGENES229 _Z 0.046 

334582 CH22_FGBiES407 5 0.046 

336437 CH22_FGENES£26 4 0.047 

337461 CH22.FGENES.782-1 0.047 

302892 N58545 Hs.6975 histone deacetylase 3 0.046 

338689 CH2^EMAC005500.GENSCAN.475-3 0.049 

334721 CH22^FGENES421_32 0.049 

305867 AA864572 EST singleton (not in UniGene) with exon hit 0.049 

335498 CH22_FGENES£71_7 0.05 

311596 AJ682083 Hs.223368 ESTs 0.05 

326959 CR21JisgI|6469836 0.051 

311688 AW025661 Hs.240090 ESTs 0.052 

317298 AI922374 Hs.158549 ESTs 0.052 

332984 CH22_FGBJES54_6 0.052 

321039 AW247083 EST cluster (not in UniGene) 0.053 

335844 CH22_FGENES.623j4 0.053 

325371 CH.1£_hsgi]5866920 0.054 

335667 CH22_FGENES.590_18 0.054 

333635 CH22_FGENES22B_2 0.054 

336736 CH22LFGBIES.1 10-2 0.055 

335893 CH22J=GBJES.635J 0.055 

333170 CH22.FGENES.94 5 0.055 

329768 CH.14_p2gl|6015501 0.055 

334030 CH22LFGENES.320_2 0.055 

323359 AA234172 Hs.137418 ESTs 0.055 

300453 AW051431 Hs.113029 ribosomal protein S25 0.055 

334262 CH2aj=GBJES^67_12 0.055 

306590 AI000246 EST singleton (not in UniGene) with exon hit 0.055 

331087 R22520 Hs.23398 ESTs 0.055 

338620 CH22_B1AC005500.GENSCAN^50-18 0.056 

339045 CH22_DA59H1B.GENSCAN^ 0.056 

308023 AI452732 EST singleton (not in UniGene) with exon hit 0.057 
339067 CH22_DA59H18.GENSCAN.33-3 0.057 

335689 CH22_FGENES.596_4 0.057 

339069 CH22_DA59H18.GENSCAN.33-5 0.057 

338176 CH22_EM^C005500.GENSCAM21W 0.057 

328159 CHX6_hsgi|5868065 0.058 

335655 CH22_FGENES.590_6 0.058 

336371 CH22J=GENES£20J 0.058 

336558 C822_FGENES.842_3 0.059 

337738 CH22_B^ AC000097.GBISCAN.1 004 0.059 

334273 CH22LFGENES.369 _2 0.059 

335889 CH22_FGENES.633_3 0.059 

327807 CHX5_hsgi|5867968 0.059 

333315 CH22J=GENES.138_7 0.059 

338825 CH22LW246D7.GENSCAN.4* 0.06 

337612 CH22LC20H12.GENSCAhL22-5 0.06 

333897 CH22_FGENES293_4 0.06 

335990 CH22_FGENES.655_4 0.06 

334264 CH22.FGENES.367 15 0.06 

338653 CH22_EM4C00550aGENSCAN.46CK39 0.061 

322303 W07459 EST cluster (not in UniGene) 0.061 

333498 CH22LFGENES.163_8 0.061 

336522 CH22_FGBsIES^39_3 0.061 

301357 AW295677 Hs.137840 ESTs; Moderately similar to HOMEOBOX PROTEIN SIX1 [H.sapiens] 0.062 

305917 AA876469 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.062 

338143 CH22_FGENES.705_5 0.063 

333493 CH22^FGBJES.168 J. 0.063 

332533 M99487 Hs.1915 folate hydrolase (prostate-specific membrane antigen) 1 0.063 

325844 CH.16Jisgi|6552453 0.063 

336402 CH22_.FGENES.823 17 0.063 

335767 CH22_FGENES.607J 0.064 

301893 T80334 EST cluster (not in UniGene) with exon hit 0.064 

324019 AW177009 EST cluster (not in UniGene) 0.064 

305801 AA845997 EST singleton (not in UniGene) with exon hit 0.064 

335188 CH22_FGENES.507._3 0.065 

337533 CH22_FGENES.828-2 0.065 

333311 CH22JH3ENES.138J3 0.065 

335668 CH22_FGENES£90_19 0.065 

306786 AI041589 EST singleton (not in UniGene) with exon hit 0.066 

306365 AA962086 EST singleton (not in UniGene) with exon hit 0.068 

306249 AA933840 EST singleton (not in UniGene) with exon hit 0.066 

335018 CH22_FGENES.474 6 0.066 

333594 CH22_FGENES.210_3 0.068 

333900 CH2_J=GENESiS3 - 7 0.066 

325207 CH.10JIS #552430 0.067 

329888 Crt15_p2gQ6067149 0.067 

328238 CR17Jisgij5B67260 0.067 

333658 CH22_FGENES241_4 0.067 

335809 CH2LFGBIESX17_6 0.068 

307427 AK43437 EST singleton (not in UniGene) with exon hit 0.068 

318428 AI949409 Hs524583 ESTs 0.069 

327005 CH21_hsgl[5867664 0.069 

330463 HG998-HT998 Sulfotransferase, Phenol-Preferring 0.069 

333318 CH22_FGENES.138J0 0.07 

333313 CH22__FGENES.138J5 0.07 

325937 CH.16_hsgi|5867132 0.07 

335663 CH22J=GB*ESiB0J4 0.07 

335349 CH22_FGENESi39J 0.07 

303396 AA224470 Hs.25426 ESTs; Weakly similar to unknown [H.sapiens] 0.07 

332603 N66681 Hs.33470 ESTs 0.07 

333310 CH2^.FGENES.138_2 0.071 

309924 AW340812 EST singleton (not in UniGene) with exon hit 0.071 

336340 CH22J=GBIESJ14J5 0.071 

308025 AI453365 Hs.172928 collagen; type I; alpha 1 0.071 

306805 AI055968 EST singleton (not in UniGene) with exon hit 0.071 

335499 CH22_FGENES571JB 0.071 

329669 Cai4_p2gi|6272129 0.071 

321666 D28390 EST cluster (not in UniGene) 0.071 

338174 CH22_EMAC005500.GB^SCAN219^ 0.072 
305451 AA738105 Hs.140 

Hs.174007 
Hs.226223 
Hs.88504 
Hs.128141 
Hs.251577 
Hs.124860 

333214 
331917 AA446572 
339102 
328122 
332250 N62712 
328506 
331758 AA291468 
335193 
317729 AA971718 
304515 AA458708 
313644 AI565766 
826145 

306516 AA989542 
300629 AA152119 
333160 
337490 
305403 AA723748 
331747 AA281765 

330513 M81057 
308905 AI859636 
337419 
333459 
334851 
329046 
327879 
305830 AA857665 
302928 AL137719 
304321 AA136698 
326390 
335230 

335331 

304753 AA578840 
301863 AI418883 
336561 
335611 
305060 AA635771 
306051 AA905130 
308289 AI571211 

332634 S38953 

334758 
309641 AW194230 

328304 

331809 AA402482 
326138 

330570 U60276 
334305 

333531 

Hs.183689 
Hs.180884 
Hs.8102 

330385 AA449749 Hs£1388 
323305 AA811351 Hs.25307 
Hs.65843 

CH22LFGENES.842L1 0.072 
Immunoglobulin gamma 3 (Gm marker) 0.072 
CH22_FGENES.46-1 0.072 
CH21JlsgI]6004446 0.073 
CH22_FGENES.3Q3_1 0.074 
CH22_FGENES.104_5 0.074 
ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING 0.074 

CH22J)A59H18.GENSGAN,44-9 0.074 

CH.06_hs gI|5B68031 0.075 

KIAA0618 gene product 0.075 

CHX7_hsg!l5868471 0.075 

ESTs 0.075 

CH22_FGENES507_8 0.076 

ESTs 0.078 

hemoglobin; alpha 2 0.076 

ESTs 0.076 

CH.17Jisgi|5867204 0.076 

CH22LPGENES£23_6 0.077 

EST singleton (not in UniGene) with exon hit 0.077 

Hs.155101 ATP synthase; H+ transporting; mitochondrial F1 complex; alpha subunit; isoform 1; cardiac muscle 0.077 

CH23_.FGENES.91_2 0.077 

CH22_FGENES.799-5 0.077 

EST singleton (not in UniGene) with exon hit 0.077 

ESTs 0.077 

CH22.FGENES.3J2 0.078 

carboxypeptidase B1 (tissue) 0.078 

ribosomal protein S20 0.078 

CH2^FGENES.7594 0.078 

CH22J=GENES.157_8 0.078 

CH22_FGENES.440J 0.078 

CKXJisgi|5868569 0.078 

CHX6_hs gi[5868142 0.079 

EST singleton (not in UniGene) with exon hit 0.079 

EST cluster (not in UniGene) with exon hit 0.079 

ribosomal protein S25 0.079 

CH.19_hsgl|5867340 0.079 

CH22J=GENES£14_2 0.08 

CH22.FGENES.412 6 0.08 

CH22J=GENESJ35j4 0.08 

major histocompatibility complex; class I; B 0.08 

EST cluster (not in UniGene) with exon hit 0.081 

CH22.FGENES.842J5 0.081 

CH22_FGENES<583_5 0.081 

EST singleton (not in UniGene) with exon hit 0.081 

EST singleton (not in UniGene) with exon hit 0.082 

EST singleton (not in UniGene) with exon hit 0.082 

CH2^FGENESJ378_13 0.082 

CH22_FGENESi71^4 0.082 

Human unidentified gene complementary to P450c21 gene; partial cds 0.082 

CH2*_EM:ACOQ5500.GENSCAN.13-18 0.082 

CH22.FGENES.61 9_7 0.082 

CH2^FGENES.428_7 0.082 

EST 0.082 

CH22_FGENES.75_7 0.083 

CH2Z_Bvt:AC005500.GENSCAN.477-25 0.083 

ESTs 0.083 

CH.17Jsgtj5867203 0.083 

CHX7_hsgfj6004478 0.083 

arsA (bacterial) arsenite transporter; ATP-binding; homolog 1 0.083 

CH2S_FGB!ES-373JB 0.083 

C822_FGENES,632J3 0.083 

CH.16Jisgi|6552452 0.083 

CH22LFGENES.175J8 0.084 

ESTs; Highly similar to secreted apoptosis related protein 1 [H.sapiens] 0.084 

Homo sapiens clone 24812 mRNA sequence 0.084 

ESTs 0.084 



Hs.113029 



Hs.77981 



Hs.253100 

HSX7312 

Hs.165439 
335888 CH23.FGB1ES.633J2 0.084 

306008 AAS94390 EST singleton (not in UniGene) with exon hit 0.084 

334249 CH22J=GENES.385_15 0.084 

318303 AW451197 Hs.113418 ESTs 0.084 

330171 CH.02j>2 gi|6648220 0.084 

336662 CH22_FGENES.41-1 0.085 

320506 AI815668 Hs.157478 suc1-associated neurotrophic factor target 2 (FGFR signalling adaptor) 0.085 

316974 AI740721 Hs.128202 ESTs 0.085 

336492 CH22_FGENES.832J 0.085 

335750 CH22_FGENES£Q2_4 0.085 

335678 CH22.FGENES.594J 0.086 

336093 CH22J=GENES£91J 0.086 

310932 AI933861 Hs.222852 ESTs 0.088 

335160 CH22_FGENES^02_4 0.086 

334306 CH23J=GENES.373_9 0.086 

334793 CH2*J=GB<ES.433J5 0.088 

333938 CH2^JGBIES^01J2 0.087 

336413 CK22LFGENES^23J35 0.087 

333775 CH2U=GENESJ272JB 0.087 

335971 CH22_FGENES£52 - 4 0.087 

301737 AI615981 EST cluster (not in UniGene) with exon hit 0.087 

339101 CH22JJA59H18.GENSCAN.44-6 0.087 

327612 CR04Jlsgi|65252B3 0.087 

326241 CH.17jts 01(5867260 0.088 

338386 CH22_EM^C005500.QENSCAN^314 0.088 

327762 CK05Jtsgq5867961 0.088 

305266 AA679772 EST singleton (not in UniGene) with exon hit 0.088 

334359 CH22LFGENES.37Bjt 0.088 

335500 CH22J=GENES.571J0 0.088 

329687 Cai4jj2gl|61 17856 0.088 

333654 CH22LFGBJES.240_2 0.088 

324430 AA464018 EST cluster (not in UniGene) 0.088 

325999 CH.16J* g!|5867073 0.089 

334832 CH22__RiENES.439J 0.089 

339115 CK22J3A59H16.GENSCAN.493 0.089 

300898 AI916902 Hs.213882 ESTs 0.089 

328784 CH.07_hs gi|5868309 0.089 

335044 CH22_FGENES.480_1 0.089 

329791 CH.14_p2gi]6469354 0.089 

333656 CH2*_FGENES.240_4 0.089 

326180 CH.17J»gi|5887211 0.089 

333391 CH22_FGENES.144_6 0.089 

338324 CH22_EM^C0O5500.G^iSCAN^3 0.089 

305396 AA721052 EST singleton (not in UniGene) with exon hit 0.089 

337483 CH2?_FGENES.7957 0.09 

325424 Cai9_hsgp67369 0.09 

306454 AA977992 EST singleton (not in UniGene) with exon hit 0.09 

338893 CH22JXJ32J10.GBJSCANJ-6 0.09 

327470 CUG2Jisgil5867772 0.09 

333165 CK2aj=GENES^1.7 0.09 

307155 AI186738 Hs.182426 ribosomal protein S2 0.09 

330717 AA233926 H&23S35 ESTs 0.09 

335334 CH2*J=GENES.535J0 0.09 

335907 CH22_FGENES.638_2 0.09 

333885 CH22J=GENES.292_7 0.09 

331034 N51868 Hs.31965 ESTs; Moderately similar to 40S RIBOSOMAL PROTEIN S20 [H.sapiens] 0.09 

304660 AA534416 Hs.162185 ESTs 0.09 

328217 CHX»Jtsgq5888096 0.091 

336068 CR22_FGENES.684_13 0.091 

302833 AA295381 Hs.44423 ESTs 0.091 

328668 CR07Jisg||5888254 0.091 

335309 CH22J=GENES^32^ 0.091 

338481 CH22_B^C005500.GENSCAT1377^ 0.091 

306288 AA936892 EST singleton (not in UniGene) with exon hit 0.091 

305070 AA639783 EST singleton (not in UniGene) with exon hit 0.091 

304870 AA594811 Hs.119122 ribosomal protein L13a 0.091 

303856 AA968589 Hs.944 glucose phosphate isomerase 0.091 
323789 AJ 459812 Hs.170460 ESTs; Weakly similar to KIAA0990 protein [H.sapiens] 0.092 

334910 CH22_FGENES.455_3 0.092 

326382 CH.19_hsgi|5867327 0.092 

332467 AA469630 Hs.119004 KIAA0665 gene product 0.092 

338534 CH22_EM:ACO05500.GENSCAN.4G2-7 0.092 

336449 CH22JGBIES.829J 0.092 

333709 CH2^FGBIES250JJ4 0.092 

336559 CH22_FGENES.842J 0.092 

333230 CH2^FGENES.107J0 0.093 

333133 CH22LFGENES32L9 0.093 

334885 CH2£J=GENES.451_11 0.093 

330605 X02419 Hs.77274 plasminogen activator; urokinase 0.093 

338392 CH22_FGENES.823_4 0.093 

334083 CH22_FGBJES^27_38 0.093 

325469 CH.iajtsgpi7034 0.093 

331077 R09531 Hs.19039 ESTs 0.093 

303701 AW500732 EST cluster (not in UniGene) with exon hit 0.093 

334218 CH22_FGENES.358_3 0.093 

338542 CH22J=GENESJ40_6 0.093 

337151 CH22J=QENES546-1 0.093 

333642 CH2aJ=GENES531_2 0.093 

336863 CH22_FGENES.297-4 0.093 

334680 CH22J=GENES.419_2 0.093 

326365 CH.18_hsgI)5857297 0.093 

338952 CH22JM32110.GENSCAN.23-22 0.093 

337539 CH22JFGENES.832-4 0.094 

333546 CH22J=GENES.180_2 0.094 

335258 CR22JFGENES516J3 0.094 

336786 CH22_FGENES.168-19 0.094 

321644 AI204177 Hs£37396 ESTs 0.094 

335943 CH22_FGENES.646_17 0.094 

327918 CHX36Ju>gi|5868165 0.094 

306398 AA970548 EST singleton (not in UniGene) with exon hit 0.094 

335671 CH22J=GENES.592_3 0.094 

335033 CH22LFQBIES475J1 0.094 

338277 CH22_EfttAC005500.GBISCAN290-2 0.094 

332061 AA504812 Hs.192824 early B-cell factor 0.094 

305153 AA654582 Hs.77039 ribosomal protein S3A 0.094 

333880 CH22J=GENES.292_2 0.094 

323940 AI864428 Hs.170880 ESTs 0.094 

313779 AA648798 Hs.129771 ESTs 0.095 

323109 AA169345 EST cluster (not in UniGene) 0.095 

332930 CH22_.FGENES.38j4 0.095 

335368 CH22JH3ENES.543JB 0.095 

303887 R72672 Hs.193484 ESTs; Weakly similar to Similarity with yeast gene L3502.1 [C.elegans] 0.095 

336223 CH22_FGENES.727_3 0.095 

311280 AI767957 Hs.197737 ESTs; Weakly similar to Y38A8.1 gene product [C.elegans] 0.095 

337256 CH2£JGENES.648-3 0.095 

308814 AI819263 EST singleton (not in UniGene) with exon hit 0.095 

334659 CH22J=GENES.418_7 0.095 

335895 CH22LJFGENES.635J3 0.095 

321697 AW388061 Hs.4953 golgi autoantigen; golgin subfamily a; 3 0.096 

336010 CH22J=GENES.668_8 0.096 

302824 U21260 EST cluster (not in UniGene) with exon hit 0.096 

333612 CH22JFGENES217J7 0.096 

304823 AA584337 EST singleton (not in UniGene) with exon hit 0.098 

335665 CH22J=GENES.590J6 0.096 

306518 AA989598 EST singleton (not in UniGene) with exon hit 0.096 

335243 CH22J=GENES.516J 0.096 

335436 CH22J=GENES.559J 0.096 

300243 AM20256 Hs.161271 ESTs 0.096 

332810 CH22.FGENES.7J 2 0.097 

308612 AI735634 EST singleton (not in UniGene) with exon hit 0.097 

335818 CH22_FGENES.618_6 0.097 

325838 CH.16_hsgi]5552452 0.097 

337482 CH22J=GENES.795-6 0.097 

336645 CH22J=GENES.26-1 0.097 

337293 CH2a.FGENES£75-1 0.098 
329893 CH.16_p2gq6525313 0.098 

326533 CH.19Jisgip867441 0.098 

334905 CH2aj=GENES452_20 0.098 

306347 AA961144 EST singleton (not in UniGene) with exon hit 0.096 

336676 CH22_FG0^ES434 0.098 

339166 CH22_DA59H18.GENSCAN.69-7 0.098 

335774 CH22-FGENES.607JO 0.098 

339216 CH22J=F113D11.GENSCANX-11 0.098 

335311 CH22J=GENES.532j4 0.093 

329632 CH.11JJ2 #729060 0.098 

328595 CR07Jisgi|586a224 0.098 

326928 Ca21_hsgi]6456782 0.098 

315234 AI079680 Hs.120770 ESTs 0.098 

306082 AA908508 EST singleton (not in UniGene) with exon hit 0.098 

305710 AA826544 EST singleton (not in UniGene) with exon hit 0.098 

318540 T30280 EST cluster (not in UniGene) 0.099 

337553 CH22_C4G1.GENSCAN.2-1 0.099 

320951 AA344069 Hs.202699 neurexophilin 4 0.099 

303845 T08033 EST cluster (not in UniGene) with exon hit 0.099 

338981 CH22_OA59Hl8.GENSCAN-!-5 0.099 

321313 R87365 Hs.26058 ESTs; Weakly similar to p532 [H.sapiens] 0.099 

328348 CH.07 hsgi[5868383 0.099 

332203 H49388 Hs.102082 EST 0.099 

301780 R07064 EST cluster (not in UniGene) with exon hit 0.099 

332095 AA608838 Hs.162681 EST 0.099 

333227 CH22LfGB*ES.107_5 0.099 

316442 AA760894 Hs.153023 ESTs 0.099 

326001 CH.16 hsgi|5867073 0.099 

334363 CH2_3GENES.378 11 0.099 

338895 CH22_DJ32i10.GENSCAN.9-2 0.099 

327460 CH.02_hsgil5004455 0.099 

332705 T59161 Hs.76293 thymosin; beta 10 0.1 

307806 AI351739 EST singleton (not in UniGene) with exon hit 0.1 

322800 F25037 Hs.225175 ESTs 0.1 

304918 AA602697 EST singleton (not in UniGene) with exon hit 0.1 

334327 CH22.FGBtES.375 4 0.1 

316359 AI097439 Hs.135548 ESTs 0.1 

£6644 CK20 hsgll5867559 0.1 

334454 CH22 FGENES .388 3 0.1 

327959 CH.06J*g[|5888210 0.1 

323783 AA330586 Hs.131819 ESTs 0.1 

309198 AI955915 Hs.248038 major histocompatibility complex; class I; C 0.1 

339265 CH22_BA354i12.GENSCAN.1M 0.1 

320578 AL049977 Hs.162209 Homo sapiens mRNA; cDNA DKFZp564C122 (from clone DKFZp564C122) 0.1 

338132 CH22_EM^C005500.GENSCAN 200-2 0.1 

333163 CH22J=GENES.91 5 0.101 

337584 CH22_C20H1 2.GENSCAN.5-1 0.101 

307588 AI285535 EST singleton (not in UniGene) with exon hit 0.101 

336969 CH22J=GENES378-2 0.101 

327535 CH.Q2_hsgi|5525279 0.101 

328732 CH.07 hSfll|5868289 0.101 

336686 CH2Z_FGENES46-3 0.101 

335777 CH22J=GENESj607_13 0.101 

332944 CH22_FG»JES.47 3 0.101 

333174 CH22JGENES.95J 0.101 

336330 CH22JFGENES.821J3 0.101 

330571 U60800 Hs.79089 sema domain; immunoglobulin domain (Ig); cytoplasmic domain; (semaphorin) 4D 0.101 

331789 AA398721 Hs.186749 ESTs 0.101 

338915 CH22_DJ32110.GBISCAN.12-1 0.101 

334844 CH22_FGENES>t39_24 0.101 

336642 CH22JGENES.234 0.101 

334906 CH2a_FGBiES452_21 0.101 

333188 CH22_FGBIESS8_8 0.101 

300088 AW299993 EST cluster (not in UniGene) with exon hit 0.101 

329373 CRX_hsgi[5682537 0.102 

331120 R46576 Hs.23239 ESTs 0.102 

335856 CH22FGENES.628L1 0.102 
331888 AA431337 Hs.98017 ESTs 0.102 

333154 CH22_FGENES3S_4 0.102 

335989 CH2£J=GBIES.655_2 0.102 

304385 AA235S02 EST singleton (not in UniGene) with exon hit 0.102 

338016 CH22_EM^C005500.GENSCAN.133-1 0.102 

335190 CH2£/GENES.507J5 0.102 

318595 T39486 Hs.6137 ESTs 0.102 

333897 CH22_FGENES^50J1 0.102 

306526 AA989713 EST singleton (not in UniGene) with exon hit 0.103 

328734 CR07_hs gi)5668289 0.103 

307294 AI205612 Hs.73742 ribosomal protein; large; P0 0.103 

327424 CK02_hs gi|5867751 0.103 

335872 CH22J=GENE&630_3 0.103 

333572 CH22LFGENES.189J 0.103 

334774 CH2aj=QENES.430_6 0.103 

338660 CH22_EM*C0(B500.GENSCAN.462-1 0.103 

326713 CH2Q_bs g#867595 0.103 

333994 CH22_FGENES.310_18 0.103 

335800 CH2£_FGENES.613 4 0.103 

318113 AJ187943 Hs.132322 ESTs 0.103 

337278 CH22JGENE&665-1 0.103 

338388 CH22_FGB4ES.822_6 0.103 

334790 CH22LFGBIES-432_15 0.103 

303778 AW505368 EST cluster (not in UniGene) with exon hit 0.104 

336524 CH22_FGENES.839J5 0.104 

328936 CH.08_hs gi|5868500 0.104 

335102 CH22_FGBMES.494^7 0.104 

300935 AA513844 Hs.222815 ESTs; Weakly similar to Wiskott-Aldrich Syndrome protein [H.sapiens] 0.104 

307581 AI284415 EST singleton (not in UniGene) with exon hit 0.104 

317301 AW291683 Hs.226056 ESTs 0.104 

335330 CH22J=GENES.535_3 0.104 

337968 Cr^EM^C005500.GENSCAN.103-2 0.104 

335627 CH22_FGENES584_7 0.104 

33S274 CH22_FGENES.7B2J2 0.104 

334730 CH22J=GB4ES.424_5 0.105 

334409 CH22_FGENES.383_6 0.105 

327237 CHjOIJis gi!5B87544 0.105 

333321 CH2^FGB£S.138_13 0.105 

303161 AA452366 EST cluster (not in UniGene) with exon hit 0.105 

333738 CH22LFGENE&26L2 0.105 

338255 CH22_EM*C005500.GENSCAN 276-3 0.105 

334282 CH22J=GENES.369J2 0.105 

330190 CHJSSJZ gq6165182 0.105 

310748 AW014249 Hs.158698 ESTs 0.105 

338150 CH22_ErAACO05500.GENSCAN^07-2 0.105 

336719 CH22_FGENHS.B2-6 0.105 

330228 CH.05_p2gi]6013527 0.105 

327801 CH.05Jisgi|5867824 0.105 

330525 S75168 H&J274 megakaryocyte-associated tyrosine kinase 0.105 

334972 CH22_FGENES.468^ 0.105 

335111 CH22J=GENES.494J9 0.106 

334483 CH22_FGBfES^95_5 0.108 

328829 CK07Jisgi|5868337 0.106 

302753 M74299 EST cluster (not in UniGene) with exon hit 0.106 

334512 CH22_FGENES.338J0 0.106 

330024 CH.1B_p2 gIJ6871908 0.106 

321030 AI769930 Hs.233617 Homo sapiens (clone B3B3E13) Huntington's disease candidate region 0.107 

338410 CH22_BA4C00550aGENSCAN.3414 0.107 

334353 CH22J=GBIES.376_5 0.107 

338276 CH22_B^C005500.GENSCAN^8B^ 0.107 

329053 CHJUwgi|5868574 0.107 

336560 CH22_FGENES.B42J5 0.107 

332158 AA621363 Hs.112980 EST 0.107 

336447 CH2^FGENES^29_4 0.107 

333703 CH22LFGENES.250J7 0.107 

326207 CH.17_hsgl|5867222 0.107 

333232 CH22_FGENES.108_1 0.107 
334927 CH22.FGENES.460J 0.115 

330535 U11872 Human interleukin-8 receptor type B (IL8RB) mRNA, splice variant IL8RB1 0.115 

328591 CH.07JIS 9(5868227 0.115 

334902 CH22J=GBJES.452_16 0.115 

328525 CH.07J»sgl)5858482 0.115 

325870 CH.16Jtsgij6682492 0.116 

337522 CH^FGENESmi 0.116 

305079 AA641329 EST singleton (not in UniGene) with exon hit 0.116 

327343 CHj01Jisgq6017017 0.116 

333918 CH22 J=GENES.296_7 0.116 

333600 CH2*J=GENES.213J2 0.116 

335846 CH22_FGENES.623_6 0.116 

333510 CH22_FGENES.171_4 0.116 

327629 Oi04_hsgil5887872 0.116 

333470 CH2a.FQENEai6L6 0.116 

326855 CH.20_hsgi|6552460 0.116 

327008 OL21_hsgi|5867684 0.117 

337480 CH22JGENES.79W 0.117 

338425 CH22_FGENES^24J0 0.117 

321964 AL079687 Hs.171065 ESTs 0.117 

335651 CH22LFQENES590J2 0.117 

308164 AI521574 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.117 

337827 CH2^EM^C005500.GBISCAN^3 0.117 

300341 H45095 Hs.153524 ESTs 0.117 

300154 AI245127 Hs.179331 ESTs 0.117 

306285 AA937331 EST singleton (not in UniGene) with exon hit 0.117 

329670 CH.14_p2g]|6272128 0.117 

335612 CH22_FGENES.583_6 0.117 

307645 AI363450 EST singleton (not in UniGene) with exon hit 0.117 

330401 028383 Human mRNA for ATP synthase B chain, 5'UTR (sequence from the 5'cap to the start codon) 0.117 

327127 CH^1_hsgll6682520 0.117 

333843 CH2aJGENES290_1 0.117 

331083 R17762 Hs.22292 ESTs 0.117 

329140 CHJLhsgpi7060 0.117 

339338 CH223A354U2.GENSCAN£7-3 0.117 

331074 AA464518 Hs.89616 ESTs 0.117 

338631 CH2£_EM:ACOa55O0.GENSCAN 454-2 0.117 

330299 Ca06_p2gj)2905881 0.117 

330351 CH.09_p2 gi|3056622 0.117 

305377 AA715714 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 0.117 

333106 CH22_FGENES.79J2 0.117 

338514 CH22_EM"AC005500.GENSCAN^924 0.117 

327335 CH.0Usg!j5902477 0.117 

301970 AB028962 Hs.120245 KIAA1039 protein 0.118 

326339 Cri17Jsgi]6056311 0.118 

330612 X15673 HsJ3174 Human endogenous retrovirus pHE1 (ERV9) 0.118 

334178 CH23JGENES350J 0.118 

328008 CH.08_hsgp02482 0.118 

329976 CR16_p2 gi|4878063 0.118 

320952 AA897432 Hs.130411 ESTs 0.118 

305621 AA789095 EST singleton (not in UniGene) with exon hit 0.118 

337850 CH22_EKtAC005500.GENSCANJ4-3 0.118 

333626 CH22_FGENES224_2 0.118 

337672 CH22^\C000097£ENSCAN.67-1 0.118 

328803 CH.07_hs gi]6004475 0.118 

325922 CH.16_hsgi|5867122 0.118 

334489 CH22 J=GENES.397J 0.118 

320638 R54768 Hs.101120 ESTs 0.118 

321932 AA569229 EST cluster (not in UniGene) 0.118 

336958 CH22.FGENES.367-1 0.118 

332082 AA600176 Hs.112345 ESTs 0.118 

306004 AA889992 EST singleton (not in UniGene) with exon hit 0.116 

338803 CH22J=GENES.194-1 0.118 

309107 AI925823 EST singleton (not in UniGene) with exon hit 0.118 

336859 CH22_FGB^ES.283-9 0.118 

337935 CH22_BfcAC005500.GBISCAN^6 0.118 

326492 CH.19Jsgi|5867422 0.118 






WO 02/30268 



PCT/US01/32045 



327289 CH.01Jisgi)5867481 0.119 

325818 CH.14_hsgi)6682490 0.119 

310787 AW262580 Hs.159040 ESTs 0.119 

330028 CH.16j>2gij6671808 0.119 

325317 CH.11_hsgi(5866878 0.119 

335279 CH22.FGENES.523J 0.119 

331720 AA192173 Hs£21530 ESTs 0.119 

329188 CRXJlSgi)5B68711 0.119 

316012 AA764950 Hs.119898 ESTs 0.119 

338316 CH22_E\tAC005500.QENSCAN^04-2 0.119 

326033 CM7JisgiI5867178 0.119 

334745 CH22_FGENES.426_3 0.119 

333051 CH22_FGENES.73_5 0.119 

301763 R01279 EST cluster (not in UniGene) with exon hit 0.12 

304502 AA454809 Hs.172828 collagen; type I; alpha 1 0.12 

335680 CH22_FGENES.594_5 0.12 

304678 AA548558 EST singleton (not in UniGene) with exon hit 0.12 

335441 CH22J=GENES.560_4 0.12 

336187 CH22_FGENES.717_11 0.12 

309422 AW087175 EST singleton (not in UniGene) with exon hit 0.12 

336047 CH22_FGENES.679_9 0.12 

309651 AW195850 EST singleton (not in UniGene) with exon hit 0.12 

308547 AI695385 Hs.201903 EST 0.12 

304443 AA399444 EST singleton (not in UniGene) with exon hit 0.12 

336245 CH22J=GENES.746_3 0.12 

302703 H72333 EST cluster (not in UniGene) with exon hit 0.12 

335690 CH22_F6ENES.596J> 0.12 

328941 CH.08_hsgil6456765 0.12 

333873 CH22J=GBJES.291_9 0.12 

317248 AW105092 Hs.155690 ESTs 0.12 

339288 CH23_BA354l12.GENSCAN.16-6 0.12 

337996 CH22_EMJ\C005500.GENSCAN.116-3 0.12 

333304 CH22J=GENES.137J 0.121 

308332 AI591235 EST singleton (not in UniGene) with exon hit 0.121 

329319 CRX_hsgi|6381978 0.121 

302086 X57138 multiple UniGene matches 0.121 

333290 CH22.fGBIES.129J 0.121 

323825 AI793080 Hs.123525 ESTs; Weakly similar to NEUTROPHIL GELATINASE-ASSOCIATED LIPOCALIN PRECURSOR [R.norvegicus] 0.121 

330575 U64105 HsiJ52280 Rho guanine nucleotide exchange factor (GEF) 1 0.121 

305274 AA679990 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.121 

333647 CH22J=GENES235_2 0.121 

302251 AA333340 EST cluster (not in UniGene) with exon hit 0.121 

329777 CH.14_p2gPQ2090 0.121 

333155 CH22J=GENES*9_5 0.121 

326122 CH.17.hsg?5867194 0.121 

335310 CH22_FGENES£32.3 0.121 

335453 CH22_FGBIESi62_13 0.122 

305103 AAS43329 Hs.111334 ferritin; light polypeptide 0.122 

337284 CH22.FGENES.667-2 0.122 

337418 CH22_FGENES.758-4 0.122 

313073 AI963740 Hs.46826 ESTs 0.122 

303759 AW504164 EST cluster (not in UniGene) with exon hit 0.122 

300017 M33197 AFFX control: GAPDH 0.122 

316725 AW135084 Hs.127264 ESTs 0.122 

330738 AA293153 Hs.120980 nuclear receptor co-repressor 2 0.122 

336466 CH22_FGENES.fi29_25 0.122 

335956 CH22_FGENES.647_3 0.122 

315308 AA780564 Hs.189053 ESTs 0.122 

338925 CH22JXi32i10.GENSCAN.14-3 0.122 

334969 CH22_FGB€S.46SJ 0.122 

322050 AL137589 EST cluster (not in UniGene) 0.122 

339084 CH22_DA59H18.GB4SCAN.38-2 0.122 

338323 CH22_EM^COO55O0.GENSCAN^06-2 0.122 

337003 CHS^FGENES^I 9-7 0.122 

325470 CR12_hsgl|6017Q34 0.123 

338503 CH22JFGENESJ33J0 0.123 

330786 D60374 HS258712 EST 0.123 
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329448 CH.YJ»gi|5868886 0.123 

303326 AA229433 Hs.222634 ESTs; Moderately similar to ubiquitin-like protein/ribosomal protein S30 0.123

309067 AI916313 Hs.212788 EST 0.123

317464 AA968472 Hs.130463 ESTs 0.123

328755 Ca07_hsgi|5868301 0.123 

326038 CH.17Jisgi|5867178 0.123 

327208 CH01Jisgi|5867447 0.123 

326124 CR17_hsgiJ5916395 0.123 

10 327509 CH.02Jwgl]6117B15 0.123 

338398 CH22.EM^C005500.GENSCAN.33&6 0.123 

304652 AA527782 Hs.84298 CD74 antigen (invariant polypeptide of major histocompatibility complex; class II antigen-associated) 0.123

335797 CH2?J=GENES.612L6 0.1*4 

IS 336714 CH2aj=GENES.76-29 0.124 

327204 CH.01J«gij5867447 0.124 

331881 AA430672 Hs.123778 ESTs 0.124 

306971 AI126509 EST singleton (not in UniGene) with exon hit 0.124

336174 CH2e.FGENES.710J 0.124 

20 336126 CH2a_R3ENES.701_13 0.124 

329129 CHXJlsgi|6586026 0.124 

303049 AW407562 EST cluster (not in UniGene) with exon hit 0.124

335778 CH23JH3ENESJ607J4 0.124 * 

336601 ' CH22JFQmES2S5_Z 0.124 

25 334340 CH23J=GBIES.375J7 0.124 

337438 CH22J : GENES.787-1 0.124 

306013 AA896990 EST singleton (not in UniGene) with exon hft 0.124 

339213 CH22_FF1 1301 1 .GENSCAN.6-8 0.124 

335355 CH22.FGENES.541JJ 0.124 

30 336552 CH22J=GENES.841_9 0.124 

336384 CH2?_FGENES.822_4 0.124 
310485 AI286202 Hs.149800 ESTs 0.125 
335840 CH22_FGENES.622_3 0.125 
336444 CH22_FGENES£27 10 0.125 

315703 N36070 EST cluster (not in UniGene) 0.125

327763 CH.05_hsg!|5867961 0.125 

336383 CH23J=GENESJ82?J3 0.125 

333498 CH22J=GENES.168.6 0.125 

328662 Ca07_hsgi]6004473 0.125 

40 338988 CH22J2A59H18.GENSGAM5-1 0.125 

328311 CU07Jtsgt|5868371 0.125 

337241 CH22J=GENES£44-2 0.125 

336933 CH23JH3ENES.350-7 0.125 

313483 AW294432 Hs.144252 ESTs 0.125 

45 326116 CH.17.hsgi]5867193 0.125 

330450 HG363-HT383 Epidermal Growth Factor Receptor-Related Protein 0.125 

307491 AI268539 EST singleton (not in UniGene) with exon hit 0.125

331852 AA418988 Hs.38314 Homo sapiens mRNA; cDNA DKFZp586L0120 (from clone DKFZp586L0120) 0.125

330462 HG944-HT944 Dopamine Receptor D4 0.125

304410 AA284508 EST singleton (not in UniGene) with exon hit 0.125

336385 CH22_FGBJES.822_5 0.125 
336793 CH22.FGENES.176-3 - 0.125 
326243 CH.17J1S 9^5867261 0.125 

55 327266 CK01Jtsgi|5867462 0.125 

320753 AF070579 Hs.181544 Homo sapiens clone 24487 mRNA sequence 0.125

336960 CH22J=GB€S.369-5 0.125 

329667 CH.14_p2gi|6272129 0.125 

328168 CH06_hsgfJ5868O71 0.125 

60 336534 CH22.FGENES.839J6 0.125 

339289 CH22_BA354l12.GENSCAN.16-9 0.126 

309230 A1970747 EST singleton (not in UniGene) with exon hit 0.126 

339190 CH22JT113D11.GENSCAN.1-2 0.126 

337088 CH22J=GENES.45&-14 0.126 

319233 R21054 Hs.211522 ESTs 0.126

339396 CH22_BA232E17.GENSCAN.6-8 0.126

331930 AA449077 Hs.179765 Homo sapiens mRNA; cDNA DKFZp586H1921 (from clone DKFZp586H1921) 0.126

308099 AI475914 EST singleton (not in UniGene) with exon hit 0.126
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33847 CH22_EM^C005500.GBJSCAN^73-5 0.126 

334286 CH22_FGENES.369J6 0.126 

317245 AI025039 Hs.131732 ESTs 0.126 

335249 CH22_FGENESi16_10 0.126 

5 333327 CH22_FGENES.133_20 0.126 

304240 AA009802 EST singleton (not in UniGene) with exon hit 0.126

335464 CH22LFGENES562^6 0.126 

335236 CH22_FGENES.515_8 0.126 

334154 CH22J=GENES.340_4 0.126 

309257 AI984183 EST singleton (not in UniGene) with exon hit 0.126

310015 AI220122 Hs.201981 ESTs; Weakly similar to breast carcinoma-associated antigen [H.sapiens] 0.126

328280 CH^7Jisgp68352 0.126

305744 MB31819 EST singleton (not in UniGene) with exon hit 0.126

15 327430 CK02JtsgP67754 0.126 

328323 CH.07_hs gl|5868373 ai26 

333274 CH22LFGEMES.123J2 0.126 

337193 CH22_FGENES£75-3 0.127 

334820 CH22_FGENES.437_2 0.127 

20 328706 CHXJ7_hsgij5868270 0.127 

331228 W67267 Hs.174911 ESTs 0.127 

307205 AI192479 EST singleton (not in UniGene) with exon hit 0.127

337123 CH22J=GENES£19-3 0.127 A 

3262)1 CH.17JiSflip867216 0.127 

25 335276 CH22J=QBIES^23J2 0.127 

331202 T81115 Hs.191138 ESTs 0.127 

330532 U03187 Hs.121544 interleukin 12 receptor; beta 1 0.127

321235 N49521 EST cluster (not in UniGene) 0.127

301743 F12605 Hs.204529 ESTs; Weakly similar to reverse transcriptase [H.sapiens] 0.127

30 328175 CH.06JIS gJJ5B68Q73 0.127 

306407 AA971985 EST singleton (not In UniGene) with exon hit 0.127 

327145 CH.01Jtsgij5BQ7548 0.127 

327649 CH.04_hs gi|5B67899 0.127 

335142 CH22J=GENES.493_12 0.127 

35 333905 CH22_FGENES.285_2 0.127 

330608 X04325 Hs.2679 gap junction protein; beta 1; 32kD (connexin 32; Charcot-Marie-Tooth neuropathy; X-linked) 0.127

330158 CR21_p2gl|658Q367 0.127

320153 AF064594 Hs.120360 phospholipase A2; group VI 0.127

314407 AA098835 Hs.224432 ESTs 0.127

333383 CH22_JGBiES.143_22 0.127 

320663 AI734242 Hs.244473 ESTs 0.128

326233 CH.17_hsgi|5867232 0.128 

326598 CR20J*sgt|5867634 0.128 

45 335174 CH22J=GENESi04„4 0.128 

319843 H29920 Hs.99486 ESTs; Weakly similar to aralar1 [H.sapiens] 0.128

335458 CH22_FGENES562J8 0.128 

332997 CH2^FQENES58_4 0.128 

334188 CH22_FGENES.352_3 0.128 

50 329759 CH.14JJ2 gi|6048280 0.128 

330348 CH.09js2gij4544475 0.128 

326958 CR21_hs gi|6469836 0.128 

305263 AA679467 EST singleton (not in UniGene) with exon hit 0.128

337693 CH22.EMAC000097.QENSCAN.78-14 0.128 

55 326812 CR20.hsgi|6682504 0.128 

333237 CH22J=GENES.108_7 0.128 

333699 CH22J=GENES.250_13 0.128 

311498 AI768677 Hs.209888 ESTs; Weakly similar to phosphatidylserine synthase-2 [M.musculus] 0.128

336499 CH22_FGENES.833_4 0.128

320087 AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD 0.128

309989 AI184186 Hs.197813 ESTs 0.128

301490 AW288468 Hs.250461 ESTs 0.128

337011 CH22J=GENES.427-6 0.128 

65 315052 AA876910 Hs.134427 ESTs 0.128 

301611 W22172 Hs£9038 ESTs 0.128 

336497 CH22_FGENES333 _2 0.129 

302068 Y16280 Hs.132049 endothelin type b receptor-like protein 2 0.129

334502 CH22_FGENES.397J8 0.129 






304332 AA158884 EST singleton (not in UniGene) with exon hit 0.129

304522 AA465405 EST singleton (not in UniGene) with exon hit 0.129

312407 R46180 Hs.153485 ESTs 0.129

310098 AI685641 Hs.161354 ESTs 0.129

301119 AF142579 EST cluster (not in UniGene) with exon hit 0.129

309268 AI985821 Hs.62954 ferritin; heavy polypeptide 1 0.129

330989 H42142 Hs.226396 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 19 (Dbp5; yeast; homolog) 0.129

336949 CH22_FGENES.381-4 0.129 

10 330115 CH.19_p2gi|6015202 ai29 

339212 CH2aj=F113D11.G9JSCAN^7 0.129 

326951 CR21_hsgi|6004448 0.129 

305165 AA662939 EST singleton (not in UniGene) with exon hit 0.129

308238 AI559492 EST singleton (not in UniGene) with exon hit 0.129

337140 CH22_FGENES.537-5 0.13

321758 U29112 EST cluster (not in UniGene) 0.13

304619 AA515554 Hs.119598 ribosomal protein L3 0.13

312469 AA745289 Hs.173088 ESTs 0.13

339017 CH22J3A59H18.GENSCAN^6 0.13 

20 330116 CR19j)2gi|60152Q2 0.13 

333312 • CH22_FGENES.138j4 0.13 

338004 CH22_EM:AC0G5500.GEN$CAN.121-1 0.13 

314141 AA232134 Hs.190028 ESTs 0.13 

300509 AI239845 Hs.128494 ESTs; Weakly similar to EG35B7.2 [D.melanogaster] 0.13

25 338530 CH22_ENtAC005500.GBI SCAN ^98-1 1 0.13 

335968 CH22JGENES.652J 0.13 

314121 AI732100 Hs.187619 ESTs 0.13 

337593 CH23_C20H12.GB4SCAN.64 0.13 

332881 CH22_F0ENES^3 1 0.13 

305836 AA856043 EST singleton (not in UniGene) with exon hit 0.13

339059 CH22_DA59H18.GENSCAN.30-5 0.13

305610 AA782319 EST singleton (not in UniGene) with exon hit 0.13

305852 AA862455 EST singleton (not in UniGene) with exon hit 0.13

327409 Oi02_hsgij5867750 0.13 

35 312751 AI613089 Hs.164178 ESTs 0.13 

308726 AI759268 Hs.209929 EST 0.13

325961 Cai8_hsgi]5867147 0.13 

311159 AW025919 Hs.197636 ESTs 0.13 

322715 AA057230 Hs.182135 ESTs 0.13 

40 336441 CH22J=GENES.827_7 0.13 

336339 CH22_FGENES.814_12 0.13

306911 AI095365 EST singleton (not in UniGene) with exon hit 0.13

333613 CH22_FGENES.217 8 0.13 

338489 CH22_B^ AC005500.GENSCAN^84-1 7 0.131 

45 326904 CR21Jis gi|5867684 0.131 

337337 CH2S.FGENESJ17-1 0.131 

326752 CR20Jisgi|5867615 0.131 

303977 AW512978 EST singleton (not in UniGene) with exon hit 0.131

301373 AA595235 EST cluster (not in UniGene) with exon hit 0.131

50 338448 CH22_EM^C005500.GENSCAN.359-22 0.131 

333774 CH22j:GENES272_5 0.131 

332986 CH22.FGENES.54_8 0.131 



335362 CH22_FGENES£41J2 * 0.131 

335896 CH22_FGENES.635 4 0.131 

55 337825 CH22_EM^C005500.GENSCAN.13-19 0.131 



£5257 CR11_hsgi|5866895 0.131 

331188 T50240 Hs.167837 ESTs 0.131 

330645 Y08302 Hs.144879 dual specificity phosphatase 9 0.131

331760 AA292721 Hs.154434 ESTs; Weakly similar to unknown [H.sapiens] 0.131

322995 AA513829 Hs.29797 ribosomal protein L10 0.131

335497 CH22J=GENES.571J5 0.131 

334824 CH2*J=GENES437_6 0.131 

319480 R06933 Hs.164221 ESTs 0.131 

334842 CH2£J=Ge£S.439J1 0.131 

65 333335 CH22J=GB4ES.139J 0.131 

317252 AA905178 Hs.130124 ESTs 0.131

329034 CHJUlsgil5868561 0.131

305186 AAS64230 EST singleton (not in UniGene) with exon hit 0.131

335755 CH22_FGENES.604_4 0.131
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302143 H 15270 Hs.1 89847 putafive neuronal ceG adhesion molecule 0.131 

334939 CH22_FGENES.465_3 0.131 

318994 C15110 Hs.17802 ESTs 0.131 

334498 CH22J=GENES597_14 0.131 

5 333413 CH22J=GENES.148_2 0.132 

329876 CH.14_p2g!|8272128 0.132 

327277 CH.01_hs g!J5B67473 0.132 

305022 AA627416 EST singleton (not in UniGene) with exon hit 0.132

336805 CH22_FGENES.196-3 0.132

320121 T93657 EST cluster (not in UniGene) 0.132

334781 CH22_FGENES.428_10 0.132

339400 CH22_BA232E17.GENSCAN.7-6 0.132

330301 CH.06j)2 gi)2905862 0.132

316822 AA827691 Hs.129967 ESTs; Weakly similar to neuronal thread protein AD7c-NTP [H.sapiens] 0.132

328020 CR06Jisgi|5902482 0.132 

325327 CH.11_hsgi|5866875 0.132 

321163 AA209530 EST cluster (not in UniGene) 0.132

336393 CH22_FGENES.823_5 0.132 

20 325905 Cai6JisgI|5867104 0.132 

305237 AA676286 HsU186 eukajyofc translation etongation factor 1 gamma 0.132 

339046 CK2£J)A59Hie.GENSCAN.2fr6 0.132 

325375 CH.12_hsgJl5866920 0.132 * 

333961 CH22_FGENES.304_7 0.132 

25 335450 CK22_FGENES ,562.8 0.133 

302286 R58438 EST cluster (not in UniGene) with exon hit 0.133

335116 CH22_FGENES.496_3 0.133

327333 CH.01Jlsg(|59Q2477 0.133

308070 AI470948 EST singleton (not in UniGene) with exon hit 0.133

308311 AI581855 EST singleton (not in UniGene) with exon hit 0.133

320313 AW360847 Hs.208839 ESTs 0.133

323665 AW248307 EST cluster (not in UniGene) 0.133

328318 CH.07_hsglj5868373 0.133

320603 R51419 EST cluster (not in UniGene) 0.133

35 332791 CH22.FGENES3J 0.133 

314976 AA524725 Hs.162108 ESTs 0.133 

303309 AL134164 Hs.224868 ESTs 0.133

320581 R39753 Hs.170187 ESTs 0.133 

333944 CH2^.FGENESJ0^ 0.133 

40 317992 AI733512 Hs.130901 ESTs 0.133 

330935 F02383 Hs.26492 beta-1,3-glucuronyltransferase 3 (glucuronosyltransferase I) 0.133

336659 CH22_FGB(ES36-5 0.133

338887 CK2LPJ32110.GENSCAN.6-10 0.133

305273 AA679979 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 0.133

333566 CK2£JGENES.183_2 0.134

316952 AW450033 Hs.163312 ESTs 0.134

333818 CH22_FGENES.283J 0.134

328687 CR07Jlsgp58262 0.134

302879 H11802 EST cluster (not in UniGene) with exon hit 0.134

50 336557 CH22LFGENES.843J2 0.134 

335222 C^FGENES513_5 0.134 

338094 CH22_Bw1AC005500.GENSCAN.17W 0.134 

337384 CH22.FGENES.745-1 * 0,134 

327360 Cri01.hsgl|6552411 0.134 

55 328132 CR06Jis gi|5868038 0.134 

323604 AT751438 Hs.182827 ESTs; Weakly similar to !!!] ALU SUBFAMILY SQ 

WARNING ENTRY Un 0.134 

337591 CH22_C20Hl2.GENSCAN.6-6 0.134 

307018 AI140639 EST singleton (not in UniGene) with exon hit 0.134

60 326896 CH21Jhsgi|5667680 0.134 

333479 CH22_FG94ES.163_5 0.134 

337915 CH22_EMJVC005500.GENSCAN.61-3 0.134 

335110 CH22J=GENES.494J8 0.134 

333481 CH22_FGENES.163_9 0.134 

65 327512 CHj02Jisgi|6117815 0.134 

300098 AW328639 Hs.83575 ESTs; Weakly similar to ZC328.3 [C.elegans] 0.134

330163 CRG2_p2gtj6042042 0.135 

335752 CH22_FGENES.604_1 0.135 

334857 CH22J=GENESM3J 0.135 
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301872 
337529 
335734 
337551 
309078 
335513 



321907 
337189 



308601 
305020 



AI719930 
AA627248 



322465 AA137152 
305601 



20 310087 AI393914 



328752 
337611 
334470 
335115 
328730 
330350 
336971 

308258 AI565612 
326745 
335440 

320257 AA330746 
328677 
329731 

315950 AA700553 



40 333458 



305134 
307058 
331943 
331116 
306094 
333561 
321439 
324594 
337926 
337353 
331836 
308981 



H84730 EST duster (not in UnlGena) with exon hit 0.135 

CH22_FGENES.823-29 0.135 

CH2*_FGENES.601j4 0.135 

CH22_FGENES.847-8 0.135 

AI920965 Hs.77961 major histocompatibility complex; class 1; 8 0.135 

OE2_FGENES.571_28 ai35 

CH22_DA59H18.QENSCAN.37-6 0.135 

Hs.148722 ESTs; Weakly similar to large tumor suppressor 1 [H.saplans] ai35 

CH2a_FGENES571-32 0.135 

CH.12_p2gIl5302817 ai35 

EST singleton (not in UniGene) wffli exon hft 0.135 

H&2064 vimenfin 0.135 

022_FGENES.293J 0.135 
Hs-3784 ESTs; Highly similar to phosphoserine aminotransferase 

[Rsaptens] 0.135 

AA780975 EST singleton (not In UniGene) with exon hft 0.135 

H10781 Hs.141051 ESTs; Moderately similar to III! ALU SUBFAMILY SB 

WARNING ENTRY 0.135 

CR05Jtsgi|5867968 0.135 
Hs.160624 ESTs; Weakly similar to similar to CR16; SH3 domain 

binding protein ai35 

CK07_hsgi[5868298 0.135 

CH22_C20H12.GBJSCAN.194 0.135 " 

CH22_FGENES.394_1 0.136 

CH2a_FGENES.496_2 ai38 

CR07_hsgi|5868289 0.136 

CR09_p2gi]3056S22 0.136 

CH22_FGENES.37*6 ai36 

EST singleton (not in UniGene) with exon hft 0.136 

CR2OJtS0i|5867611 0.136 

CH22»FGENES.560_3 0.136 

EST duster (not in UniGene) 0.136 

CR07_hsgil58S8256 0.138 

CH.14_p2gil6065783 0.138 

H&206974 ESTs 0.136 

CH.17_p2gJ]4567182 ai38 

CH22_FGENES.448-3 0.138 

K&31059 EST ai36 

H&232820 EST 0.136 

CH2^FGENES.157„7 0.136 

Cai5j>2gll6563505 0.136 

HS200133 ESTs 0.136 

CH2ajH5ENES^10J5 0.136 

Hs.75514 nucleoside phosphorytese 0.136 

Hs.185588 ESTs 0.136 

K&3003 CD3E antigen; epsiton polypeptide (TTT3 complex) 0.136 

EST singleton (not In UniGene) wBh exon hit 0.136 

Hs.191109 ESTs 0.136 

CH2a_FGENES^89_15 0.136 

Hs.100725 EST 0.138 

EST singleton (not in UniGene) with exon hit 0.138 
Hs.8594 Homo sapiens mRNA containing (CAG)4 repeat; done CZ-CAG-7 0.136 

CH22_FGENES.403_8 - 0.136 

CH22_FGENES.543_26 0.138 

CH22JGENES.839_8 0.136 

CH22_FGENES.465_20 a 136 

CH.16_hsgl|5867087 0.137 

EST singleton (not In UniGene) with exon hit 0.137 

EST singleton (not In UniGene) with exon hit 0.137 

Hs.178272 ESTs 0.137 

Hs£2634 ESTs 0.137 

EST singleton (not in UniGene) with exon hit 0.137 

CH22J=GENES.180_18 0.137 

H61962 EST cluster (not in UniGene) 0.137 

AA497090 EST duster (not in UniGene) 0.137 

CH22^EMAC00550aGENSCAN.77-4 0.137 

CH22LFGB^ES.726-1 0.137 

AA412295 Hs.104774 EST 0.137 

A1873242 EST singleton (not in UniGene) wfih exon hfi 0,137 
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329424 CH.YJis gt[5868979 0.137 

325829 CH.15_hsgi]5867Q52 0.137 

331845 AM16863 H&9B183 ESTs 0.137 

333854 CH22J=GENES.290 13 0.137 

306591 AI000248 EST singleton (not in UniGene) with exon hit 0.137

328948 CH.08J1S gi|6456765 0.137

338935 CH22_DJ32I10.GENSCAN.18-12 0.137

325960 CH.16_hsgi|5867147 0.137

328377 CH.07_Jis gl|5868390 0.138

308851 AI829820 EST singleton (not in UniGene) with exon hit 0.138

314620 AA424352 Hs.210588 ESTs 0.138

337592 CH22_C20H12.GENSCAN.6-7 0.138 

338684 CH2a_EMAC005500.GBJSCAN.472-3 0.138 

331800 AA400488 Hs.97543 ESTs 0.138 

15 304587 AA505535 EST singleton (not in UniGene) with exon hit 0.138 

333981 CH22J=GENES.310_4 0.138 

332452 AA040369 Hs.11170 SYT interacting protein 0.138

305752 AA835278 EST singleton (not in UniGene) with exon hit 0.138

311947 T65554 Hs.251591 EST 0.138

20 333783 CH22_FGBlES273_5 0.138 

337406 CH22_FGENES.754-14 0.138 

327976 CH^6J)sgi|5868212 0.138 

325593 CH.13Jisgi|5866992 0.138 *" 

339425 CH22_DJ579N16.GENSCAN.14-4 0.138 

25 304475 AA428879 EST singleton (not in UniGene) with exon hit 0.138 

309488 AW131104 EST singleton (not In UniGene) with exon hit 0.138 

337532 CH2?_FGENES£27-6 0138 

317234 AA904448 Hs.126368 ESTs 0.138 

312261 AA854425 Hs.144455 ESTs 0.138 

30 328927 CH.08_hs gi]5B68500 0.138 

336424 CH22_FGENES£24_9 0.138 

326667 CH^0_hsgf|6552455 0.138 

325988 CH.16_hsgi|5867064 0.138 

318446 AW300287 EST cluster (not in UniGene) 0.139

35 338511 CH22_FGENES.834_6 0.139 

335204 CH22_FGENES508 13 0.139 

303244 AA147472 EST cluster (not in UniGene) with exon hit 0.139 

330870 AA115804 Hs.187593 ESTs 0.139 

329376 CHJUisgi]5868859 0.139 

304703 AA563898 EST singleton (not in UniGene) with exon hit 0.139

333553 CH23J=GENES239_2 0.139

306799 AI051698 EST singleton (not in UniGene) with exon hit 0.139

304872 AA595289 EST singleton (not in UniGene) with exon hit 0.139

330812 AA013001 Hs£0563 ESTs 0.139 

45 329568 CH.10_p2gi]3962490 0.139 

319210 AA253074 Hs. 146261 ESTs 0.139 

334320 CH22_FGENES.374_5 0.139 

300860 AI916949 Hs.149748 ESTs; Weakly similar to weak similarity to collagens [C.elegans] 0.139

305868 AA864533 EST singleton (not in UniGene) with exon hit 0.139

312943 AA984364 Hs.119064 ESTs 0.139

330523 M99439 Hs33958 transducin-like enhancer of split 4; homolog of Drosophila E(sp1) 0.139

312708 AI076204 Hs.135440 ESTs 0.139

309366 AW072970 EST singleton (not in UniGene) with exon hit 0.139

303273 AA316069 EST cluster (not in UniGene) with exon hit 0.139

317484 AW274696 Hs.143921 ESTs 0.139

333239 CH22_FGENES.111J 0.139 

307126 AI184951 EST singleton (not in UniGene) with exon hit 0.139 

316813 AA826505 Hs.124517 ESTs 0.139 

331746 AA281365 Hs.121640 ESTs; Weakly similar to KIAA0386 [H.sapiens] 0.139

308558 AI700145 Hs.172182 poly(A)-binding protein; cytoplasmic 1 0.139

310784 AW086142 Hs.159017 ESTs 0.139

323831 AA335715 Hs.200299 ESTs 0.139

307692 AI318342 EST singleton (not in UniGene) with exon hit 0.139

310570 AI318327 EST cluster (not in UniGene) 0.139

327934 CH.06J»gi|5868184 0.139

305232 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 0.139

334756 CH2ajGa€Sv428J5 0.139

331938 AA451857 Hs.99255 ESTs 0.139

301393 AM74722 Hs.150898 ESTs; Weakly similar to KIAA0644 protein [H.sapiens] 0.139
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312005 
338431 
331214 
333601 
323481 
336911 
338157 
327845 
319109 



302996 
323751 
329916 
301993 
338129 
325704 



331673 
316807 
310743 
326941 



304705 



333747 
318287 
332972 
305704 
315699 
327296 
336400 
321033 

316522 
335715 



337382 
322346 
325378 
338500 
338460 
315279 
314439 



T78450 Hs.13941 
T90496 Hs.16757 
AA278449 Hs.137429 



Z45662 HSS0797 



AF054663 

AW452656 H&209824 
N49826 Hs.18602 



W72366 H&40033 
AI018331 Hs.172444 
AW449754 Hs.158665 



AI653164 Hs.128665 
AA564064 



AW015616 Hs.143321 
AA825266 

AW182805 Hs.189183 



H26214 Hs20733 
AI475995 Ks. 1229 10 

AA227618 Hs.10882 



330117 
338017 
337854 
329984 
305004 
302815 
327823 
326753 
301201 
334303 



311050 
308740 
331003 
338010 



318100 
320641 



ESTs 

CH22_EM^C005500.G£NSCAN3514 
ESTs 

CH22_FGENES.213_4 
ESTs 

CH22.FGENES-344-4 
CK22.EMAO005500.GENSCANm5 
CK05J1S Q56531962 

Homo sapiens done 23620 mRNA sequence 

CK22_FGENES.428J2 

CHJLhsgi|5868869 

EST duster (not in UniQene) with exon hit 

ESTs 

CH.16_p2gi|6223624 
ESTs 

CH2SLEMAC005500.GENSCAN.197-2 

Cai4Jisgi|5867Q28 

CH22 W FGENES.590_7 

ESTs 

ESTs; H fcjhfy similar to transcription regulator [Mjraiscuhjs] 
ESTs 

CH21_hs gl]6004446 
CH.07_hsgiJ5868327 
ESTs 

EST singleton (not in UniQene) with exon hit 

CH.14_hsgi|6469822 

CH22J=GENES.265_6 

ESTs 

CH22J=GENES£1_5 

EST singleton (not in UnlGene) wfth exon hit 

ESTs; Weakly similar to Nodi (risapiens] 

CR01„hsgl|5867492 

CH22J=GBIES.823_15 

ESTs; Weakly similar to 1111 ALU SUBFAMILY SX 

WARNING ENTRY 



AW511138 H&256581 
AI539443 Hs.137447 



AA622328 Hs.162762 
N40373 



AA904482 Hs.197775 



AI864581 Hs.215477 
A1802711 HS210337 
H63959 Ks.142722 



R44308 H&242302 
R55421 



330425 HQ172B-HT1734 



CH22J=GB4ES^99J5 

CH22J=GENES.650_2 

CH2£J=GENES.118JT 

CH22JH5BIES.744* 

HMG-box containing protein 1 

Cai2_hsgil58S6920 

CH22_EMACO0550O.GENSCANmi 

CH22„EM^C005500.GENSCAN^62-5 

ESTs 

ESTs 

CH22_FGENES222_3 

CHXJisgl|5868729 

CH.19ji2gl|6015201 

CH22LEMAC005500.GBJSCAN.134-1 

CH22^EMAC005500.GENSCAN.38-12 

CH.18j)2gI|4646193 

EST 

EST duster (not in UnlGene) with exon hit 

CH.05_hsgl|5867968 

CR20J»gi|5867816 

ESTs 

CH22_FGENES^73J 

CH.19_hsgi)5867399 

ESTs 

EST; Weakly similar to aldolase A [H .sapiens] 
ESTs 

CH22_EM^C005500.GENSCAN.12M 

CH22J=GB1ES.812J 

ESTs 

EST duster (not in UnlGene) 
CH.16jisgi|5867067 

Non-Specific Cross Reacting Antigen (Gb:D90277), 
Att.SpDc8Form2 
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324583 AA425411 Hs22581 ESTs 0.142 

328268 CH.17_hsgIJ5B67267 0.142 

331390 AA460341 Hs.45008 ESTs 0.142 

338904 CH22JDJ32I10.GENSCAN.10-16 0.143 

5 333096 CH2a.R3ENES.79J 0.143 

331919 AA446869 Hs.119316 ESTs 0.143 

312214 AI248004 Hs.125187 ESTs 0.143 

323198 AW179174 Hs.7884 ESTs 0.143 

316107 AI204001 Hs.184014 ribosomal protein L31 0.143 

10 301335 AA885317 Hs.190511 ESTs 0.143 

337392 CH22JGENES.747-3 0.143 

325543 Cai2_hsgi]6682452 0.143 

305903 AA873085 EST singleton (not in UniGene) with exon hit 0.143 

332707 L35594 Hs.174185 phosphodiesterase I/nucleotide pyrophosphatase 2 (autotaxin) 0.143

337913 CH22_E?vtAC005500.GENSCAN^9-10 0.143

301436 AA961061 Hs.131696 ESTs 0.143

335078 CH23J=GENES<486_5 ai43 

338451 CH22_EM^COO55O0.GENSCAN^59-39 0.143 

302777 AJ230640 EST cluster (not in UniGene) with exon hit 0.143

330464 J03068 Hs.78223 N-acylaminoacyl-peptide hydrolase 0.143

330988 H41411 Hs.33855 ESTs 0.143 

328939 CH.08Jegi|6004481 0.143 _ 

308015 AI440174 Hs.228907 EST; Weakly similar to GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT-LIKE PROTEIN 12.3 [H.sapiens] 0.143

328504 CH.07jlS9i|5868471 0.143

332599 AA402891 Hs.32951 solute carrier family 29 (nucleoside transporters); member 2 0.143

335744 CH22.FGENES.601J5 0.143

322394 AF077208 EST cluster (not in UniGene) 0.143

323892 AL042661 EST cluster (not in UniGene) 0.143

318443 AI939323 Hs.157714 ESTs; Weakly similar to NEURONAL ACETYLCHOLINE RECEPTOR PROTEIN; ALPHA-5 CHAIN PRECURSOR [H.sapiens] 0.143

336568 CH22_FGENES.843_7 0.143

330958 H08815 Hs.159824 EST 0.143

327372 CH.04Jisgi|5867843 0.143 

335900 CH22_FGENES.635_8 0.144 

336044 CH22LFGENESJ679JB 0.144 

318845 AI815951 Hs.33183 ESTs; Weakly similar to estrogen-responsive finger protein; efp [H.sapiens] 0.144

333483 CH22J=GENES.165 J 0.144 

333337 CH22J=GBIES.139J 0.144 

305993 AA889197 EST singleton (not In UniGene) with exon hit 0.144 

335719 CH22_FGENES.599_22 0.144 

45 325682 Cai4Jisgq6136923 0.144 

327350 Crt01Jisg?6249583 0.144 

339291 CH2^BA354I12.GENSCAN.18-1 0.144 

326358 CH.18_hsgi|5867293 0.144 

330316 CH.C8_p2 gi|5007576 0.144 

50 308150 AI499346 Hs.174131 ribosomal protein L6 0.144 

338065 CH22 EM*C005500.GEN SCAN. 164-1 0.144 

339009 CH22_DA59H18.GENSCAN.18-7 0.144 

327776 CrL05_hsgi|5867964 - 0.145 

336664 CH22J=GENES.41-8 0.145 

321921 AF070619 EST cluster (not in UniGene) 0.145

319346 T70147 Hs.12024 ESTs 0.145 

304265 AA062892 EST singleton (not In UniGene) with exon hit 0.145 

303818 Z45986 K&250178 copinetl 0.145 

327498 CH.Q2_hsgi|6017Q23 0.145 

60 335227 CH22J=GENES513J3 0.145 

339022 CH22_DA59H18.GENSCAN.22-1 0.145 

302597 H55661 Hs.33026 ESTs; Weakly similar to similar to Enterococcus faecalis TRAB [C.elegans] 0.145

308550 AK97008 Hs501811 EST 0.145

302175 AA262760 Hs.156015 Homo sapiens chromosome 19; cosmid R29381 0.145

303252 AA156760 EST cluster (not in UniGene) with exon hit 0.145

337414 CH22_FGENES.757-2 0.145

310382 AI734009 EST cluster (not in UniGene) 0.145

329333 CHJUisgl)5868806 0.145 
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EST singleton (not in UniGene) with exon hit 
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336567 


CH22_FGENES^43„6 


0.147 


335819 CH22_FGENES.619_2 0.147

336950 CH22_FGENES.381-8 0.147

307050 All4o477 EST singleton (not in UniGene) with exon hit 0.147

310134 AW504004 Hs.126714 ESTs 0.147

330334 CH22_FGENES^1J 0.147

327870 CH.06_hsgfl5868131 0.147

323802 AA302U11 Hs.250138 protein phosphatase 2C; magnesium-dependent; catalytic subunit 0.147

329412 CHJUisgi|6882553 0.147

323791 AA3330OO EST cluster (not in UniGene) 0.147

324126 AA3oo31o EST cluster (not in UniGene) 0.147

327865 CH.06 hsgi}5868130 0.147

333445 CH22_FGENES.154_2 0.147

321302 AAQ213S1 58497 KIAA0724 gene product 0.147

336744 CH22_FGENES.118-9 0.147

323731 AA323414 EST cluster (not in UniGene) 0.148

320289 H07939 EST cluster (not in UniGene) 0.148

305488 AA749000 EST singleton (not in UniGene) with exon hit 0.148

305592 AA780594 Hs.62954 ferritin; heavy polypeptide 1 0.148

304094 H11295 EST singleton (not in UniGene) with exon hit 0.148

325040 AW296368 EST cluster (not in UniGene) 0.148

339034 CH22_DA59H18.GENSCAN.26-2 0.148

334504 CH22_FGENES.398_2 0.148

334778 CH2zJ^ENES.431^2 0.148

320148 U77494 Hs.119687 RAN binding protein 8 0.148

303584 AW173759 Hs.203401 ESTs 0.148

325826 Cai5jisgit5867048 0.148

331192 T551B2 Hs.152571 ESTs; Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens] 0.148
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325785 CaHJe gi|6381957 0.148 

333166 CH2£JGBiES.91_8 0.148 

336548 CH2LFGENES.841J5 0.148 

337552 CH2a.C4G1.GENSCAN.14 0.148 

331775 AA382742 Hs.97151 EST 0.148

338936 CH22_DJ32I10.GENSCAN.19-6 0.148

331869 AA428554 Hs.104894 ESTs; Weakly similar to fibronectin precursor [H.sapiens] 0.148

332865 CH22_FGENES.28_5 0.148

328663 CR07J»s gi|6004473 0.148

328436 CH.07_hsgi|5868417 0.148

311158 AI634864 Hs.250789 ESTs; Highly similar to similar to NEDD-4 [H.sapiens] 0.148

336942 CH22_FGENES.354-2 0.148

302262 R53169 Hs.246091 ESTs 0.149

333296 CH22J=GENES.132_3 0.149 

15 333365 CH22_FGENES.142_2 0.149 

311706 AW452392 H&252854 ESTs 0.149 

337109 CH22.FGBES.489-2 0.149 

315062 AW173300 Hs.190201 ESTs 0.149 

333454 CH22_FGENES.157_3 0.149 

20 334784 CH22_FGB1ES.432_9 0.149 

333255 CH22_FGF31ES.118_3 0.149 

337518 CH22_FGENES.814-7 0.149 

320651 AA489268 EST cluster (not in UniGene) 0.149

323437 AA287567 EST cluster (not in UniGene) 0.149

25 328761 CH.07Jlsgi[5868302 0.149 

328787 CH.07_hsgIl5868309 0.149 

335261 CH2^_FGBJESS20_? 0.149 

300827 R16689 Hs.106004 ESTs 0.149 

339263 CH22_BA354I12.GENSCAN.10-1 0.149 

30 337412 CH22LFGENES.756* 0.149 

334414 CH22LFGENES.384J 0.149 

332931 CH22.FGENES.38_5 0.149 

310801 AW270980 Hs.106346 novel centrosomal protein RanBPM 0.149 

305216 AA669056 EST singleton (not In UniGene) with exon hit 0.149 

35 314779 AM70122 Hs.190261 ESTs 0.149 

338414 CH22_EfAAC005500.GENSCAN341-27 0.149 

303342 AW247361 EST cluster (not in UniGene) with exon hit 0.149

337509 CH22_FGENES^064 0.149

306631 AI001149 EST singleton (not in UniGene) with exon hit 0.149

302533 L36149 Hs.248116 chemokine (C motif) XC receptor 1 0.149

336536 CH22.FGENES.839J8 0.149 

324666 T32458 Hs.14285 ESTs 0.149 

310173 AI767433 Hs.170013 ESTs 0.149 

333595 CH22_FGENES_!11 _2 0.149 

45 335975 CH22_FGENES.652_9 0.15 

306654 AI003654 EST singleton (not In UniGene) with exon hit 0.15 

335025 CH22_FGENES475_3 0.15 

328711 CH.07_hsgi|5868271 0.15 

328274 CHD7_hsgi[5868219 0.15 

50 325505 CR12_hs gi|6682451 0.15 

329641 CH.14_p2giIS468233 0.15 

304955 AA613504 EST singleton (not in UniGene) with exon hit 0.15 

339103 CH22_DA59H18.GENSCAN.44M0 - 0.15 

329636 CH.12_p2gi|5302817 0.15 

310118 AI203293 Hs.157489 ESTs 0.15

326056 CK17Jisgif5867184 0.15

303773 AA769074 EST cluster (not in UniGene) with exon hit 0.15

303153 U09759 Hs.8325 mitogen-activated protein kinase 9 0.15
TABLE 13A shows the accession numbers for those primekeys lacking UniGene IDs for
Table 13. For each probeset, we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for the sequences comprising each cluster are listed in the "Accession"
column.
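As an illustration only, the three-column layout described above (Pkey, CAT number, Accession) can be read into a lookup keyed by Pkey. The following is a minimal Python sketch, not part of the original disclosure. The names `ClusterRow`, `parse_row` and `index_by_pkey` are hypothetical, and the two sample rows are abridged from Table 13A on the assumption that the accession lists appear in the same order as the Pkey entries.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of a Pkey / CAT number / Accession table (hypothetical type)."""
    pkey: str               # unique Eos probeset identifier number
    cat_number: str         # gene cluster number
    accessions: List[str]   # Genbank accession numbers in the cluster


def parse_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into its three logical columns."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(fields[0], fields[1], fields[2:])


def index_by_pkey(lines: Iterable[str]) -> Dict[str, ClusterRow]:
    """Build a Pkey -> row lookup, skipping blank lines."""
    table: Dict[str, ClusterRow] = {}
    for line in lines:
        if line.strip():
            row = parse_row(line)
            table[row.pkey] = row
    return table


# Sample rows abridged from Table 13A (pairing by order is an assumption).
rows = [
    "322050 24275.1 AL137589 AA423949 BE222949",
    "321921 34680.1 AF070619 T80358",
]
table = index_by_pkey(rows)
print(table["322050"].cat_number)   # gene cluster number for probeset 322050
print(table["321921"].accessions)   # accessions grouped into that cluster
```

A lookup of this kind lets a Pkey from Table 13 that lacks a UniGene ID be resolved to its gene cluster and member accessions.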
Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT number Accession 



322050 24275.1
321439 1599424.1
321666 13853.22
300088 622937 J
322303 704603.1
322394 27492J
321758 44275J
323109 155498J
322533 38937J
321921 34680.1
321927 21620J
321932 265316J
306971 14694.7



AL1 37589 AA423949 BE222949 BE222694 A1199615 AW8731 16 AI277950 AW044290 AW630096 
H61962W01567 N76711 

BE259906 AA232518 AA013359 AL035788 AW160822 BE387134 BE0Q2954 BE391839 AW161555 AI878641 BE616458 
BE409981 BE387308 BE297436 BE315536 AA206924 R12012 AA214169 BE312812 BE387093 H11710 BE312009 
BE260569 AA343566 AA219526 R34757 AA219749 BE336733 AA219751 AW411Q99 AA232408 BE018716 BE398089 
AA206253 AAQ53487 AA114224 AV655B68 AW732566 BE394087 AW732574 AA313442 BE336875 AA070548 BE259840 
BE019828 AW732341 AA299916 BE019253 BE018238 BE387109 AA232304 BE255589 AW732585 AA181436 AA308777 
AA075802AW732521 AA314526 AA226747 BE40951 3 AA206168 BE388292 BE298782 BE387086 AA30531 0 AV652723 
AA314918 8E615510 AW951763BE398104 BE385195 BE407165 BE391336 BE3901B7BE389189 BE540S50 BE249884 
BE385985 BE274245 BE391124 BE260080 AA1 82600 BB 12821 BE390090 BE279398 BE279589 BE263454 BE51S194 
BE293569 BE272531 BE388814 BE384659 BE271685 BB61043 BE278449 BE302572 AW239076 AI750583 AA376179 
AA1 12632 BE266324 BE266614 R 131 05 M132286 BE296305 A1220355 AA206606 AA219527 AA21951 9 AW804310 
AA083266 BE171208 T19693 AA338328 BE185868 AA903Q24 T92162 AA3301 19 BE410404 BE314668 
AW57B245 BE207878 AW299993 AI199558 AI285442 AW299994 AW394242 AW394184 
AI357412 A1870708 AI590539 W07459 

AW068287 AA310079 BE3367Q2 AA356318 AA306059 AA346785 AW402633 AA311210 AW402909 N76879 AW402913 
AW401920 AA321636 AA354474 C17297 C16938 AA311774 M29871 NM.002872 2B2188 AW405674 H94176 R89281 
AA214723 AJ014482 AW949347 T27749 AW804226 AW796964 AW404581 AF077208 NWL014Q29 W68830 W79852 
AA353375 AW57521 8 AA552192 AAS21232 AA702695 AA033975 AW407827 AA829948 N94402 AW628604 A1523308 
N57605 AA641662 H42477 N52784 AT753478 AA768493 AA845729 W47391 N55270 AI090117 R89282 BE206172 
AA076650 AA595650 AI218931 BE049397 AI4331 10 W741 14 H94277 AI358627 AI085221 AI862818 AA835967 AW103905 
A1640644 AA835507 AA856887 AA694392 AW337542 AI524410 BE045500 AJ440060 A!358801 AW028238 AW205248 
AT71 8264 R4861 8 AA357358 AI695002 AA897549 AW081065 A1433360 AI810783 A1620963 Z82188 AA36Q224 
U291 12 AI656540 AI364875 A1656248 AE990940 

AA169345 AT762857 AI949997 AI809601 Al 581 948 A1221 079 AW167404 A1347614 AI61 1050 AI023472 AI347633 A1027467 
AW591788 AJ380665 AA835735 AA836654 AI244028 AW193159 AI5001 12 A1918722 AI738693 AT7Q2308 AA805365 
AI766842 

T59538 T59589 T59598 T59542 AF147374 
AF070619 R203Q2 T80358 

AJ223366 BE305086 AW820106 AA621983 BE305208 AI738475 AI380189 AW590847 A! 127232 AA622706 AI380858 
AA621975 AB87036 AAS65743 AW204003 A1692234 AKX32242.AI692219 AW137282 AW268783 AW295910 AI308015 
AW301462 AI318288 AI318575 A1318117 A1345591 A1249650 AE46934 A1246864 A246971 AW26831 1 AI249654 BHM1907 
AW732776 

N72324 N52825 W19526 BE143464 AA376060 

M83887 NM.005195 S63168 M83667 AW068039 AW630649 AI338577 AI01 81 25 AI269878 AW242440 AI887823 A1342581 
BE222416 AI582847 AI651011 A1660815 AI699574 BB50201 AI926996 AW665855 AI827752 AI761857BE328168 
BE222451 AI762201 AW000929 AW007207 BE042982 B551843 BE465373 AE79179 AI949945 BE551862 AW051667 
BE328076 BE222286 AW007229 AW772332 A1279801 AI934526 A1631938 AI770103 BE041412 AI417900 AI692655 
A1869943 AW2701 19 AI431739 AI703347 AW770568 AW025473 AI701497 All 28026 BE328147 AW203980 BE046793 
AW087704 AI874597 AI650732 AIB13691 A1472092 AI695224 A1241217 AW207746 A1206840 AI271362 AI631788 AI911883 
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AA868774AA599660 
AA780365 AA909233 A1275542 
AA210878 AA215684R11101 

M13560 AA336951 AA161015 R72814 T69687 R75705 T61319 AA158454 R50579 T56649 AI2141 56 T70375 R31655 
H64997 AW800487 H49110 AA634206 H42384 H21783 AI560152 AA664230 H42302 R48708 AA01 3277 T61 901 T92417 
AA875985 T61 962 T63055 AA430725 AA458964 AA578746 AI582385 T63000 AI499875 H64998 AA022538 AI364804 
A1865211 A1439714 AI224059 AI249917 T59258 AA477806 AA715834 AA916120 R38304 R35899 R82985 H25524 H82984 
AW516728 T54642 AA079866 H27555 AA455820 T6391 9 R79450 A1431241 AA937349 AA127213 AA421729 H61 1 96 
T63894 AA013050 AA079133 W93364 AA487926 AI762796 H26377 AI433388 AI865423 AW371475 R98189 AA643978 
A1718204 AW381954 AI862735 
AA323758R12731 R14082 

R17531 AW960899 AA338366 AW673294 BE047729 BE047722 AA330746 AW841797 H05030 A1142105 R12654 
H07989 AJ239462 H24544 AA078369 R74153 

BE512926 BE304794 AA129140 AA052922 AA092258 BE378058 BE615391 BE615218 BE616188 AI214126 H05675 
W56857 AI028525 BE61 7241 BE531271 AW856227 T56489 AA322005 AW794148 AF170577 BE615738 AA005138 L76930 
L76932 176933 X95410 AW38946B BE563092 AW997937 AA263158 AI520992 AW947350 AAS22535 AW945921 AV653776 
AW884835 AW947338 Al 687 178 AW945799 A1905627 AW948449 AV653751 AW945924 AA563898 AW945810 AW945832 
AW371449 AW945864 AW948447 AW945910 AA643002 AA522680 AA522715 AA578840 AA523279 AA826150 AW945809 
AW405998 AA551909 R23173 AA595545 AW389497 AI933770 AI125053 AI471803 AW795856 AW796937 W30675 H70317 
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10 



H 68296 T59240 AA397550 H59852 AA938072 AA97801 0 R35643 T89735 AW381585 AW1 96153 AI538069 AA604540 
A1434259 R491S1 T58717 AW062486 AW796866 A1646384 R77733 AI623502 BE171342 BE171303 R35658 AW974883 
AW1 49698 AS500045 A1540710 AI54Q392 AW009172 AW277199 A1371312 AB00096 AI47G297 AW372940 AW844562 
AWB44560 AW7979S5 AI691 146 X07062 AW799199 H60666 AA837684 AF1 30734 T25952 A1933771 AI914860 AW391 925 
AW793843 AW795012 AW366709 AW750987 AW75Q9B5 R35765 AW844942 AW750986 H64920 R34651 X867Q3 
321039 26338J BE018103 BE018083 BE293253 AW247083 BE207643 BE514793 BE183238 AA376427 AW273850 AW04378S BE439973 

AL045428 AJ889050 AA026496 AI422924 A1884485 W96068 AA020872 F37119 AA714378 AA0211 07 AA011141 A1554001 
AI375841 AI469097 AA335219 AW967315 AI692177 AA410448 AI568B58 AA582647 AA026419 AA281639 AW515248 
AWD07777 AA010840 AW188439 AI805423 AI148210 BE301590 AA744414 AA745392 AW167423 AA622659 AW000878 
A1432387 AA760930 BE047189 AA021 605 AV658045 AI093347 AA588594 K63143 AA639556 A1308976 AA379270 
AA633407 AI874329 AI206484 A1493895 AI6941 03 AI249682 AA973765 AA872445 A1125446 AA287272 AW0S9761 
AA682569 AW009712 BE542774 R50167 BE301574 AA991202 AA502006 A1219819 AW074373 AAB17996 AI52 1 242 F25241 
AW615812 R16774 AA335218 AW873800 H26778 AI468557 AI886986 A1560759 AI460075 AA502968 AA503273 AA610680 
AA287274 AA554020 AA284889 AA916638 AW469457 AW273250 AW673708 AW512948 AL041071 AI446042 AA903535 
BE172441 AI282411 AW265Q21 AA810799 AI559865 AA729332 AW00461 1 AW129451 AA659019 BE208239 AA610825 
H03511 BE383995 R1 6474 AA281 701 AW009244 AA287424 AA558139 AW364081 

F08147 AW408359 AW949429 R23785 AW247442 AA305512 T29095 AA905130 BE246361 BE244981 AA220199 BE504058 
X80878 AA533727 AA608601 AW005964A1811627A1367037 AI277985 AW93719 A1277848 AA854982 AW247298 AI216345 
AI041295 AI887378 AA781241 AI674270 AW528959 AI383083 B £504391 AA729421 AA552188 AA373387 AW880360 
AW875262 AW875369 AW581540 AW875358 AW581568 R23735 AW134768 
W03912 AW971410 AA506385 AA209530 H73495 H48629 W56149 
H56752 AW340384 N49521 

AA853680 AK001668 BE386425 BE563549 BE296124 BE298950 R51419 U46285 BE147292 AA360055 R48018 AW845348 
N47383 AI817280 AI671902 AA988104 AA479464 N56996 Al 192374 AI927558 AA659888 AI799903 AA548397 AI161 167 
AI656333 AI418829 AW592671 BE327906AW513346 AI888579AW469410 AW512809 D25682 AA576079 AA479354 
T30342 R51307 T16044 H29063 AW079357 AI339477 R47914 AI986068 AI870065 AI868489 AI521099 AI582732 AA995540 
AW957299 AA352608 AA676752 AA4 105 10 AA358874 AJ8S5724 AA853679 A1B99265 AW188789 N47380 AA233715 
BE258194 R55421 R55643 H42382 AA243884 

AW886407 AA489268 R57015 R58094 BE077459 BE077423 BE546995 AW849216 T69383 AW938111 H60337 BE221073 
AB033100 AA347036 BE260325 AW961669 AL047207 AA347037 AT766894 AA601045 AI559897 AW139033 AW274622 
AW172884 AWD89070 AA804340 AW798925 
AA825266 

AL137354AL043375 
AA971985 
AA977992 
AA989542 
AA989598 
AA989713 
AA991487 
A1000248 
A1000248 
AI001149 
AI003654 
AI041589 
AI051698 
AM52732 
A1470948 
AI475914 
AI055966 
A1066577 
AI086929 
AI095365 
AJ127883 
AI559492 
AI565612 
A1571211 
AI581855 
AI591235 
AI687580 
AI719930 
AI735634 
AI744063 
308814 AI819263 
308851 AI829820 
308981 AI873242 
310570 1071946J AI318327 AI318328AI3184S5 
305022 AA627416 
305060 AA635771 
305070 AA639783 



306051 19085.3 



321163 171122J 
321235 1102181J 
320603 4297 J 



320641 185591J 

320651 58648.1 

321325 28266J 

305704 464759.-1 

322011 23158J 

306407 

306454 

306516 

306518 



306590 
306591 
306631 
306654 
306786 
306799 
308023 
308070 



306805 
306814 
306873 
306911 



308238 
308258 
308289 
308311 
308332 
308511 
308601 
308612 









305079 
305134 
303977 
305216 
305263 
305266 
305396 
305403 
305483 
305549 
305601 
305610 
305621 
305710 
305724 
305744 
305752 
307018 
307055 
307058 
305801 
305830 
305836 
305852 
305858 
305866 
305867 
307126 
305903 



AA641329 

AA653159 

AW512978 

AA669056 

AA 679467 

AA679772 

AA721052 

AA723746 

AA749000 

AA773530 

AA7B0975 

AA782319 

AA789095 

AA826544 

AA827608 

AA831819 

AA835278 

A1140639 

AI148477 

A1148709 

AA845997 

AA857665 

AA858043 

AA862455 

AA863103 

AA864533 

AA864572 

AI184951 

AA873085 



328803 qjjw 

328809 c_7_hs 

305949 AA884409 

328829 C_7Jis 

330021 c16_p2 

330024 c16_p2 

330028 c16_p2 

330049 c17_p2 

305993 AAB89197 

330095 c19_J)2 

330096 c19_p2 

307205 AI192479 
307427 AI243437 
307491 AI268539 
307581 AE84415 
307588 AI285535 

337672 CH22jB002FGUUNK_BtAC00 

337693 CH22_6030FGL.UNieEMAC00 

337738 CH22L6083R3UJhiK^tAC00 
307692 AI318342 
307606 AI351739 
309107 AI925823 
309230 AI970747 

339338 CH22.8300FGL.UNK.BA35411 
309257 AI9841B3 
309366 AW072970 
309422 AW087175 

325207 clOJis 

325257 cHJis 

309646 AW194694 
309651 AW195850 

325313 dljis 

309924 AW340812 

334030 CH22_1308FQL320J2JJNK.B^ 

334040 CH22_1318FGL322_8JJNICEM 

334083 CH2^1361FGL327J38JJNKJE 

332810 CH2?J26FGL7J2JLINKJC65E1 

302747 32813 1 AF062275 L03830 

302753 33029 1 M74299 M74302 M74303 

302777 33803.1 AJ230640 AJ230648 
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304094 
302824 



304240 
304410 
304443 
304475 
304522 
304678 
304705 
306004 
306008 
306013 
306082 
336174 
306094 
304823 
304872 
304918 



35372.1 
41198J 
c16J»s 



H11295 

U21260 U21258 
AFQ54663AF124197R70292 

AA0098Q2 
AA284508 
AA399444 
AA428879 
AA465405 
AA548556 
AA564064 
AA889992 
AAB94390 



AA908508 
CH2^3567FG_710JJJNK_DA 
AA908877 
AA584837 



306249 
306286 
306295 
306317 
306347 



306398 
330401 
330463 



330535 



AA602697 
AA613504 
AA933S40 
AA936892 
AA937331 
AA947909 
AA961144 
AA962086 
AAS70548 



entre3LD2B383 
460J 



1374.-8 
10404J 



NM.001055 AA332948 U26309 U09031 L19955 L10819 A3366043 XB4654 U71086 AV654451 AJ007418 AA053625 
BE168856 AA376730 H12694 AA810348 AA621972 AI818950 AV645367 AI819966 AA9106Q2 AW512449 H67893 AI310497 
AI304330 A1339217 AW193S88 AW438688 AI81 8970 AW316799 AA906527 AA777570 N47673 AI336428 AW945133 
A1038606 R29692 AW194197 A1304748 H12639 AA053178 AA493213 AA676958 AA1 13154 A1313469 AI368239 R93183 
W24532 U52852 U54701 AL046864 AA365795 
U11872 

U24488NM.007116 
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TABLE 13B shows the genomic positioning for those primekeys lacking unigene ID's and 
accession numbers in Table 13. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also 
listed. 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. 

Strand: Indicates DNA strand from which exons were predicted. 

Nt_position: Indicates nucleotide positions of predicted exons. 
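The four columns defined above can be read as a simple record. The following is a minimal sketch, not part of the original disclosure, of how one row of the table might be parsed. The names `ExonRow`, `ROW_RE` and `parse_row` are hypothetical, the row layout (Pkey, Ref, Strand, then a start-end pair separated by a hyphen) is inferred from the intact rows of the table, and treating the two positions as an inclusive range is an assumption.

```python
import re
from typing import NamedTuple


class ExonRow(NamedTuple):
    """One predicted exon: probeset key, sequence source, strand, positions."""
    pkey: int
    ref: str
    strand: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Assumes both positions are inclusive. Minus-strand rows list the
        # higher coordinate first, so take the absolute difference.
        return abs(self.end - self.start) + 1


# Hypothetical pattern: Pkey, free-text Ref, Strand, then "start-end".
ROW_RE = re.compile(r"^(\d+)\s+(.+?)\s+(Plus|Minus)\s+(\d+)-(\d+)$")


def parse_row(line: str) -> ExonRow:
    """Parse a single cleaned table row; raise ValueError if it does not fit."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    pkey, ref, strand, start, end = match.groups()
    return ExonRow(int(pkey), ref, strand, int(start), int(end))


if __name__ == "__main__":
    for text in (
        "338660 Dunham, I. et al. Plus 24387122-24387266",
        "332665 Dunham, I. et al. Minus 1391482-1391218",
    ):
        row = parse_row(text)
        print(row.pkey, row.strand, row.length)
```

Rows whose positions are damaged (for example, a missing hyphen) are rejected with a `ValueError` rather than guessed at, which keeps unreadable coordinates from being silently misinterpreted.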



Pkey Ref Strand Nt_position 



33279 1 
332792 
332810 
332944 
332972 
333133 
333154 
333155 
333227 
33323Q 



333304 



333391 
333392 



333413 
333445 
333479 
333481 



Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Phis 



333517 



333566 
333572 
333588 



333600 
333601 
333607 
333612 



333835 



333642 
333647 



333654 



Dunham, L etaL Plus 

Dunham, LetaL Plus 

Dunham, I. etaL Plus 

Dunham, L etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Pius 

Dunham,!. etaL Plus 

Dunham, LetaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham J. etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham, LetaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 
Dunham, I. etaL 
Dunham, I. etaL 
Dunham, I. etaL 
Dunham, I. etaL 
Dunham,!. etaL 
Dunham,!. etaL 
Dunham, I. etaL 
Dunham, I. etaL 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etaL Plus 

Dunham, I. etai Phis 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, Letal. Plus 

Dunham, I. eta!. Plus 

Dunham, Letal. Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, L etaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, L etaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 

Dunham, LetaL Plus 



72720-73315 

73381-73768 

304296-304384 

2414825-2414932 

2572152-2572238 

3360058-3360195 

3615887-3616019 

3616832-3817003 



3995507-3996507 
4581537-4581947 
46299434630242 
4630388-4630645 



4907179-4907277 
4916697-4916780 
49182944918433 
49224664922635 
49251404925258 
49438244943974 
5097827-5097885 
5272855-5272939 
5286358-5286505 
5297945-5298105 
55702046570390 
5570729-5570925 
5571761-6572025 



595422*6954473 
6026896-6027169 
6246834-6247314 
6255445*255779 
63089906309450 
6323103*323348 
6355629*355925 
6360075*360442 
6504431*504690 



6550643*550748 



6595146*595244 
6614174*614467 



6674968*675134 
6708760*709139 
6772502*772779 
6811130*811392 
6816731*816993 
6822087*822406 
6831369*831445 
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333659 Dunham, L etai 
333684 Dunham, I. et al. 
333686 Dunham, I. et al. 
333697 Dunham, I. et al. 
333698 Dunham, I. et al. 
333699 Dunham, I. et al. 
333703 Dunham, I. et al. 
333709 Dunham, I. et al. 
333747 Dunham, I. et al. 
333774 Dunham, I. et al. 
333775 Dunham, I. et al. 
333806 Dunham, I. et al. 
333843 Dunham, I. et al. 
333854 Dunham, I. et al. 
333873 Dunham, I. et al. 
333880 Dunham, I. et al. 
333865 Dunham, I. et al. 
333918 Dunham, I. et al. 
333947 Dunham, I. et al. 
333961 Dunham, I. et al. 
333981 Dunham, I. et al. 
333991 Dunham, I. et al. 
333994 Dunham, I. et al. 
334030 Dunham, I. et al. 
334083 Dunham, I. et al. 
334111 Dunham, I. et al. 
334135 Dunham, I. et al. 
334218 Dunham, I. et al. 
334249 Dunham, I. et al. 
334262 Dunham, I. et al. 
334264 Dunham, I. et al. 
334327 Dunham, I. et al. 
334328 Dunham, I. et al. 
334340 Dunham, I. et al. 
334454 Dunham, I. et al. 
334504 Dunham, I. et al. 
334508 Dunham, I. et al. 
334512 Dunham, I. et al. 
334582 Dunham, I. et al. 
334659 Dunham, I. et al. 
334721 Dunham, I. et al. 
334723 Dunham, I. et al. 
334730 Dunham, I. et al. 
334774 Dunham, I. et al. 
334778 Dunham, I. et al. 
334851 Dunham, I. et al. 
334885 Dunham, I. et al. 
334902 Dunham, I. et al. 
334905 Dunham, I. et al. 
334906 Dunham, I. et al. 
334910 Dunham, I. et al. 
335018 Dunham, I. et al. 
335025 Dunham, I. et al. 
335033 Dunham, I. et al. 
335044 Dunham, I. et al. 
335142 Dunham, I. et al. 
335157 Dunham, I. et al. 
335160 Dunham, I. et al. 
335174 Dunham, I. et al. 
335168 Dunham, I. et al. 
335180 Dunham, I. et al. 
335191 Dunham, I. et al. 
335193 Dunham, I. et al. 
335204 Dunham, I. et al. 
335222 Dunham, I. et al. 
335226 Dunham, I. et al. 
335227 Dunham, I. et al. 
335309 Dunham, I. et al. 
335310 Dunham, I. et al. 



Plus 6836179-6836248 
Plus 7169561-7169742 
Plus 7177117-7177302 
Plus 7203859-7203934 
Plus 7205278-7205383 
Plus 7206101-7206175 
Plus 7215559-7215663 
Plus 7229730-7229835 
Plus 7605884-7606206 
Plus 7716509-7716636 
Plus 7729983-7730149 
Plus 7877475-7877666 
Plus 7978762-7978887 
Plus 8029446-8029524 
Plus 8133266-8133429 
Plus 81519234152133 
Plus 81543524154437 
Plus 83071244307215 
Plus 85798884579966 
Plus 86179994618104 
Plus 87823744782643 
Plus 88374194837551 
Plus 88527494852894 
Plus 92884634288782 
Plus 98370164837081 
Plus 10279365-10279531 
Plus 10457085-10457183 
Plus 12680289-12680378 
Plus 13180430-13190574 
Plus 13231452-13231581 
Plus 13234447-13234544 
Plus 13577413-13577496 
Plus 13589868-13589936 
Plus 13642407-13642522 
Plus 14326506-14326738 
Plus 14510206-14510398 
Plus 14514936-14515122 
Plus 14545933-14546366 
Plus 15026255-15026371 
Plus 15460624-16460726 
Plus 15796816-15796987 
Plus 15805317-15805399 
Plus 15967830-15967934 
Plus 16251857-16252178 
Plus 16276180-16276395 
Plus 17820110-17820810 
Plus 19233667-19233787 
Plus 19317083-19317195 
Plus 19322553-19322680 
Plus 19323493-19323590 
Plus 19398155-19398684 
Plus 20688288-20688415 
Plus 20743941-20744050 
Plus 20753188-20753314 
Plus 20842088-20842682 
Plus 21465105-21465186 
Plus 21543302-21544341 
Plus 21573388-21573497 
Plus 21631301-21631447 
Plus 21669116-21669328 
Plus 21680807-21680876 
Plus 21681110-21681183 
Plus 21692208-21692362 
Plus 21750636-21750726 
Plus 21885542-21885603 
Plus 21890838-21890930 
Plus 21892145-21892289 
Plus 22500158-22500276 
Plus 22500714-22500831 
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335311 



335436 
335440 
335441 
335450 
335453 
335458 
335464 



335SX) 



335510 
335513 



10 



IS 335497 



20 



25 335656 



30 



335715 
335719 
35 335734 
335744 



Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 
Plus 



335651 



335665 



335819 



40 335872 



336010 



336126 
336129 
336187 
336188 



45 335976 



50 



55 336371 



60 



336441 
338444 
65 338484 



336384 



Dunham, 1. etaL Plus 
Dunham, LeLaL Plus 
Dunham, L etaL Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, L eta). Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, !. etaL Pius 
Dunham, LetaL Plus 
Dunham, I. etaf. Plus 
Dunham, LetaL Plus 
Dunham, LetaL Pius 
Dunham, I. etaL Plus 
Dunham, I. etaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, t etaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, L etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, L etaL 
Dunham, L etaL Plus 
Dunham, LetaL Plus 
Dunham,letaL Plus 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 



Pius 
Plus 
Plus 
Plus 
Plus 



Plus 
Phis 
Plus 
Plus 



22501602-22501676 
22776222-22779516 
22809167-22809461 
22843040-22843184 
22918150-22918263 
22919072-22919339 
23427793-23427923 
23458702-23459017 
23460632-23460724 
23480190-23480270 
23483333-23483459 
23490034-23490143 
23500331-23500496 
24164386-24164545 
24167686-24167869 
24172082-24172161 
24176698-24176869 
24178236-24178326 
24219973-24220039 
24222975-24223118 
24224272-24224496 
25150005-25150061 
25317560-25317698 
25333211-25333369 
25333601-25333751 
25336315-25336406 
25342680-25342802 
25344098-25344287 
25345735-25345856 
25346313-25346447 
25454350-25454604 
25455442-25455625 
25565941-25566052 
25593936-25594101 



25716483-25716615 
26310772-26310909 
26356341-26356470 
26364087-26354196 
26820760-26820943 



27743843-27744029 
27752808-27753017 
27801321-27801391 
27809041-27809187 
27983788-27983860 
27988532-27988608 
28570239-28570330 
29556922-29557002 
30057891-30058105 
30062259-30062348 
30433494-30433585 
30434870-30435004 
30833814-30833788 
33968108*33968204 
33976308-33976504 



33995323-33995434 
34005784-34005964 
34007429-34007559 
34007879-34008159 
34012965-34013115 
34187606-34187663 
34190585-34190718 
34237425-34237505 
34267190-34267245 
34267504-34287572 
34271306-34271372 
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336552 
336553 
338567 



30 



338715 
336803 



336949 



10 



IS 336993 
337076 
337109 
337123 
337151 

20 337189 
337241 
337337 



337384 
2d 337398 
337414 
337418 
337461 
337480 
337482 
337483 
337490 
337522 



337552 
337584 
337611 



337738 
337928 
337827 
337935 
337944 
337954 
337996 



338514 



338016 
338174 
338176 



338277 



338316 
338323 
338324 
338386 
338398 
338410 
338414 
338460 
338481 



Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Pius 
Dunham, I. etaL Pius 
Dunham, LetaL Phis 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Pius 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, L etaL Pius 
Dunham, LetaL Pius 
Dunham, LetaL Plus 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, LetaL Pius 
Dunham, Letat Pius 
Dunham, LetaL Pius 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, I. etaL Phis 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Phis 
Dunham, LetaJ. Phis 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, Letat Pius 
Dunham, LetaL Pius 
Dunham, LetaL Phis 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 
Dunham, LetaL Plus 



34356420-34356527 

34356683-34356753 

34428228-34426395 

34426521-34428637 

1896402*1896478 

3110198-3110314 

61069044106990 

6126661-6126786 

7745284-7745355 

8130457-8130612 

11035818-11035984 

12818687-12818891 

12875843-12875912 

13203550-13203973 

15096270-15096324 

19338177*19338679 

21166580-21166650 

22052874-22052942 

23106433-23106510 

24225887-24225954 

27280182-27280313 



30804624-30804780 



31585902-31586067 

31953012-31953205 

32014049-32014131 

32803968-32804028 

33219714-33219779 

33227865-33227946 

33237292-33237427 

33318571-33318644 

33963188-33963979 

34187269-34187368 

19497-19600 

945236-945452 

1482883-1483016 

3331236-3331313 

35759753576153 

386573*3865814 

6286377-6286470 

634303^6343172 

6534661-6534782 



6831483-6831620 

7445532-7445633 

7601363-7601520 

7863131-7863310 

12771102-12771268 

12774072-12774223 

14661936-14662015 

16167622-16167962 

16463958-16464539 

17089711-17089988 

17154655-17154792 

17155309-17155574 

18611213-18611407 

18953492-16953581 

19292807-19292916 

19345573-19345660 

20233372-20233488 



21142605-21143049 
21253847-21253974 
21379420-21379655 
21636361-21636509 
23540239-23540334 
23711167-23711241 
24219427-24219509 
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338660 Dunham, L eUL Plus 24387122-24387266 

338704 Dunham, I. et al. Plus 25230432-2529)548 
338847 Dunham, I. et al. Plus 27095337-27895420 
338887 Dunham, I. et al. Plus 28465244-28465384 
338895 Dunham, I. et al. Plus 28538893-28599135 
338915 Dunham, I. et al. Plus 28824881-28824977 
338925 Dunham, I. et al. Plus 28883892-28884036 
338938 Dunham, I. et al. Plus 29148022-29148160 
338952 Dunham, I. et al. Plus 29418831-29418988 
338980 Dunham, I. et al. Plus 29895789-29896874 
338981 Dunham, I. et al. Plus 29897917-29898008 
338986 Dunham, I. et al. Plus 30007287-30007415 
339009 Dunham, I. et al. Plus 30348477-30348598 
339017 Dunham, I. et al. Plus 30420896-30421090 
339045 Dunham, I. et al. Plus 30744286-30744356 
339046 Dunham, I. et al. Plus 30746269-30746420 
339059 Dunham, I. et al. Plus 30814655-30814801 
339067 Dunham, I. et al. Plus 30869347-30869412 
339069 Dunham, I. et al. Plus 30880975-30881070 
339078 Dunham, I. et al. Plus 30914310-30914423 
339084 Dunham, I. et al. Plus 3094455*30944803 
339101 Dunham, I. et al. Plus 31158047-31158123 
339102 Dunham, I. et al. Plus 31169321-31169563 
339103 Dunham, I. et al. Plus 3117034341170454 
339115 Dunham, I. et al. Plus 3145986941459927 
339157 Dunham, I. et al. Plus 32131701-32131833 
339166 Dunham, I. et al. Plus 32210902-32211008 
339167 Dunham, I. et al. Plus 32213567-32213730 
339288 Dunham, I. et al. Plus 33169811-33169691 
339289 Dunham, I. et al. Plus 3316675*33186903 
339291 Dunham, I. et al. Plus 33205057-33205247 
339407 Dunham, I. et al. Plus 34189461-34169620 
332665 Dunham, I. et al. Minus 1391482-1391218 
332881 Dunham, I. et al. Minus 1563520-1563184 
332930 Dunham, I. et al. Minus 2022585-2022497 
332931 Dunham, I. et al. Minus 2023851-2023562 
332984 Dunham, I. et al. Minus 263260*2632457 
332986 Dunham, I. et al. Minus 2635398-2635208 
332997 Dunham, I. et al. Minus 2710509-2710375 
333051 Dunham, I. et al. Minus 299197*2991840 
333061 Dunham, I. et al. Minus 3029831-3029527 
333064 Dunham, I. et al. Minus 3030722-3030623 
333096 Dunham, I. et al. Minus 3184234-3184118 
333069 Dunham, I. et al. Minus 320679*3206674 
333106 Dunham, I. et al. Minus 3230744-3230547 
333160 Dunham, I. et al. Minus 365489*3654678 
333163 Dunham, I. et al. Minus 36651244664962 
333165 Dunham, I. et al. Minus 36740524673905 
333166 Dunham, I. et al. Minus 36946644694587 
333170 Dunham, I. et al. Minus 37333944733299 
333174 Dunham, I. et al. Minus 37642844764210 
333188 Dunham, I. et al. Minus 38269904826863 
333214 Dunham, I. et al. Minus 39665594966437 
333232 Dunham, I. et al. Minus 4001551-4001365 
333237 Dunham, I. et al. Minus 400332*4003219 
333239 Dunham, I. et al. Minus 4095861-4094462 
333255 Dunham, I. et al. Minus 4297883-4297716 
333259 Dunham, I. et al. Minus 4306769-4306639 
333274 Dunham, I. et al. Minus 438914*4388954 
333290 Dunham, I. et al. Minus 4530734-4530554 
333295 Dunham, I. et al. Minus 454929*4549198 
333298 Dunham, I. et al. Minus 4550766-4550644 
333310 Dunham, I. et al. Minus 4837315-4637232 
333311 Dunham, I. et al. Minus 463793*4637844 
333312 Dunham, I. et al. Minus 4638794-4638635 
333313 Dunham, I. et al. Minus 4639397-4639277 
333315 Dunham, I. et al. Minus 540598*5405876 
333318 Dunham, I. et al. Minus 464263*4642564 
333321 Dunham, I. et al. Minus 464908*4648934 
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333327 



333337 
333454 
333458 



333470 



10 333498 



333510 
333546 



333738 
333780 
333783 
333818 



334040 
334154 
334178 
334188 
334273 



334303 
334305 
334306 



334359 



333900 
334285 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



334409 
334414 
334470 
334483 
334489 



334501 



334543 



334745 
334756 
334758 
334761 



334784 
334760 



334832 
334842 
334844 



Dunham, I. elaL Minus 
Dunham, I. elaL Minus 
Dunham, I. elai Minus 
Dunham,!. elaL Minus 
Dunham, 1. etai Minus 
Dunham, I. elaL Minus 
Dunham, I. elaL Minus 
Dunham, I. etai Minus 
Dunham,!. elaL Minus 
Dunham, I elaL Minus 
Dunham, LetaL Minus 
Dunham, L elaL Minus 
Dunham, I. elaL 
Dunham, L elaL 
Dunham, L elaL 
Dunham, I. elaL 
Dunham, L elaL 
Dunham, L elaL 
Dunham,!. elaL 
Dunham, L elaL 
Dunham, L elaL Minus 
Dunham, I. elaL Minus 
Dunham, I. elaL Minus 
Dunham, L elaL Minus 
Dunham, I. elaL Minus 
Dunham, I. elaL 
Dunham, I. elaL 
Dunham, LetaL 
Dunham, L elaL 
Dunham, LetaL 
Dunham, I. elaL 
Dunham, L elaL 
Dunham, LetaL Minus 
Dunham, I. etai Minus 
Dunham, I. elaL Minus 
Dunham, I. elaL Minus 
Dunham, I. etai Minus 
Dunham, I. etai Minus 
Dunham, LetaL Minus 
Dunham, I. etai 
Dunham, I. etai 
Dunham, L etai 
Dunham, I. etai 
Dunham, I. etai 
Dunham,!, etai 
Dunham, LetaL Minus 
Dunham, L elai. Minus 
Dunham, LetaL Minus 
Dunham, LetaL Minus 
Dunham, I. elaL Minus 
Dunham, I. etai 
Dunham, I. elai 
Dunham, I. elai 
Dunham, L elai 
Dunham, L etai. 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL Minus 
Dunham, L etai Minus 
Dunham, L elai Minus 
Dunham, LetaL Minus 
Dunham, LetaL Minus 
Dunham, LetaL Minus 
Dunham, LetaL Minus 
Dunham, L etai Minus 
Dunham, LetaL Minus 
Dunham, t etai Minus 
Dunham, LetaL Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



4657947-4657828 
46726564872564 
46779304677841 
5137007*136880 
5143942-5143806 
5144548*144344 
5223319*223088 
4637315-4637232 
5404643*404523 
5405980*405876 
5557628*557469 
5886643*886442 



7552160-7552084 
7750367-7750277 
7751850-7751777 
7911959-7911762 
8188855*188709 
8194390*194284 
8200268*200122 



8512805*512564 
8557051*556938 



10570714-10570572 
11755052-11754971 
11925963-11925834 
13265608-13265522 
13285293-13285178 
13289990-13289793 
13291759-13291569 
13454331-13454217 
13456310-13456209 
13461157-13461049 
13496857-13496717 
13675908-13675828 
13683722-13683596 
13728664-13728534 
13740004-13739812 
13742078-13741971 
14188289-14186163 
14195181-14195075 
14234033-14233932 
14389581-14389442 
14428355-14428281 
14455428-14454288 
14483789-14483700 
14487509-14487356 
14488605-14488526 
14834498-14834116 
15191678-15191609 
15371251-15371178 
15520047-15519887 
16049960-16049653 
16128678-16128528 
16132368-16132233 
16138424-16138319 
16148136-16148077 
16294548-16294360 
16307576-16307509 
16330748-16330681 
16413158-16413026 
16764338-16764249 
16857777-16857674 
17173957-17173760 
17464352-17464181 
17503891-17503768 
18488368-18488242 
19988711-19987853 
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Minus 20131162-20131054 

Minus 20147708-20147502 

Minus 20188176-20188020 

Minus 20294734-20294611 

Minus 20884109-20883951 

Minus 21059529-21059458 

Minus 21313841-21313598 

Minus 21320563-21320440 

Minus 21334136-21333811 

Minus 21335946-21335809 

Minus 21388250-21388146 

Minus 21388573-21388414 

Minus 21651593-21651522 

Minus 21656438-21656338 

Minus 21899517-21898678 

Minus 21915016-21914870 

Minus 21933519-21933365 

Minus 21950851-21950669 

Minus 22043431-22043262 

Minus 22063937-22063772 

Minus 22154036-22153937 

Minus 22168834-22168638 

Minus 22556589-22556422 

Minus 22556823-22556708 

Minus 22560390-22560136 

Minus 22641097-22640918 

Minus 22661861-22661271 

Minus 25070825-25070706 

Minus 25072328-25072142 

Minus 25358629-25358533 

Minus 25395274-25395152 

Minus 25402437-25402361 

Minus 25732501-25731972 

Minus 25757026-25756890 

Minus 25763806-25763747 

Minus 25818547-25819218 

Minus 25883733-25883572 

Minus 25885770-25885599 

Minus 25886469-25886334 

Minus 25958182-25958030 

Minus 25985373-25985280 

Minus 26323886-26323744 

Minus 26391707-26391530 

Minus 26420596-26420538 

Minus 26433427-28433344 

Minus 26436727-26436621 

Minus 26662452-26662346 

Minus 28939225-26938782 

Minus 26943037-26942820 

Minus 26946988-26946901 

Minus 26949087-26948665 

Minus 26973898-26973747 

Minus 26975307-26975239 

Minus 26977639-26977558 

Minus 26980354-26980238 

Minus 27013352-27013273 

Minus 27446810-27446378 

Minus 27653729-27653635 

Minus 27682313-27682145 

Minus 27704276-27704144 

Minus 29036458-29036300 

Minus 29043828-29043727 

Minus 29050617-29050466 

Minus 29252077-29251969 

Minus 3013594840135854 

Minus 3016373040163810 

Minus 3024198840241839 

Minus 3081630640816195 

Minus 3142056941420509 
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336274 Dunham,LetaL Mints 

336318 Dunham, I. et al. Minus 
336326 Dunham, I. et al. Minus 
336339 Dunham, I. et al. Minus 
336340 Dunham, I. et al. Minus 
336355 Dunham, I. et al. Minus 
336392 Dunham, I. et al. Minus 
336393 Dunham, I. et al. Minus 
336394 Dunham, I. et al. Minus 
336400 Dunham, I. et al. Minus 
336402 Dunham, I. et al. Minus 
336413 Dunham, I. et al. Minus 
336424 Dunham, I. et al. Minus 
336425 Dunham, I. et al. Minus 
336437 Dunham, I. et al. Minus 
336447 Dunham, I. et al. Minus 
336449 Dunham, I. et al. Minus 
336466 Dunham, I. et al. Minus 
336492 Dunham, I. et al. Minus 
336511 Dunham, I. et al. Minus 
336512 Dunham, I. et al. Minus 
336520 Dunham, I. et al. Minus 
336522 Dunham, I. et al. Minus 
336524 Dunham, I. et al. Minus 
336527 Dunham, I. et al. Minus 
336534 Dunham, I. et al. Minus 
336536 Dunham, I. et al. Minus 
336542 Dunham, I. et al. Minus 
336556 Dunham, I. et al. Minus 
336557 Dunham, I. et al. Minus 
336558 Dunham, I. et al. Minus 
336559 Dunham, I. et al. Minus 
336560 Dunham, I. et al. Minus 
336561 Dunham, I. et al. Minus 
336597 Dunham, I. et al. Minus 
336601 Dunham, I. et al. Minus 
336642 Dunham, I. et al. Minus 
336645 Dunham, I. et al. Minus 
336662 Dunham, I. et al. Minus 
336664 Dunham, I. et al. Minus 
336676 Dunham, I. et al. Minus 
336684 Dunham, I. et al. Minus 
336688 Dunham, I. et al. Minus 
336714 Dunham, I. et al. Minus 
336719 Dunham, I. et al. Minus 
336736 Dunham, I. et al. Minus 
336744 Dunham, I. et al. Minus 
338785 Dunham, I. et al. Minus 
336793 Dunham, I. et al. Minus 
336859 Dunham, I. et al. Minus 
336863 Dunham, I. et al. Minus 
336933 Dunham, I. et al. Minus 
336942 Dunham, I. et al. Minus 
336960 Dunham, I. et al. Minus 
336969 Dunham, I. et al. Minus 
336971 Dunham, I. et al. Minus 
337003 Dunham, I. et al. Minus 
337011 Dunham, I. et al. Minus 
337070 Dunham, I. et al. Minus 
337072 Dunham, I. et al. Minus 
337086 Dunham, I. et al. Minus 
337140 Dunham, I. et al. Minus 
337183 Dunham, I. et al. Minus 
337256 Dunham, I. et al. Minus 
337278 Dunham, I. et al. Minus 
337284 Dunham, I. et al. Minus 
337293 Dunham, I. et al. Minus 
337316 Dunham, I. et al. Minus 
337328 Dunham, I. et al. Minus 



32085468-32085303 

33364452-33364338 

33567326-33567201 

33798479-33798330 

33812069-33811915 

33874750-33874649 

3401586844015736 

3401614544015951 

3401645744016298 

34023437-34023298 

3402409044 0 2 3 981 

3404670244046576 

3405554944055491 

3405854444058446 

3407415444074090 

3419820744197996 

3420470744204577 

3421318544213046 

3425557844255437 

34277480-34277351 

3427837344278275 

3431918444319101 

3432016944320056 

3432105544320921 

3432207144321966 

3432679744326620 

3432767844327538 

34331316-34331183 

3437524444374907 

3437544344375341 

3437582544375698 

3437643044376261 

3437681444376596 

3437716844376928 

7627912-7627757 

13265853*13265654 

1304281-1304212 

1351268-1351168 

2158060-2157993 

1893558-1993481 

2022565-2022497 

21580602157993 

2160698-2160486 

30940264093871 

33316314331503 

40931284093041 

4333001-43&848 

54199734419873 

56313454631237 

82017564201561 

83966734396425 

11760045-11759981 

12027537-12027455 

13267243-13267172 

13725722-13725643 

13732308-13732221 

15523541-15523422 

16106423-16106080 

19034423-19034321 

19077452-19077323 

19657011*19656881 

22849450-22649388 

24594969-24594874 

27659956-27659876 

28429017-28428848 

28491414-28491094 

28846334-28845873 

2965712949656997 

3001719940017069 



337392 
337406 
337412 
337419 
337438 



337509 
337518 
337529 
337533 
337539 
337551 
337553 
337591 
337592 



337612 
337635






338010 
338012 
338017



337850 
337854 
337913 
337915 



338129 



338157 
338195 



338278 
338431 
338448 
338451 
338477 
338534 






339034
339180 
339212 






339338



Dunham, LetaL 
Dunham, LeLaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, Lata). 
Dunham, LetaL 
Dunham, LetaL 
Dunham, L etaL 
Dunham, LetaL 
Dunham, L etaL 
Dunham, L etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, I. etaL 
Dunham, I. eta!. 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham,LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, I. etaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LeLaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LeLaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 
Dunham, LetaL 



Minus 
Minus 
Minus 
Minus 
Minus 



31233866-31233579 
31442311-31442229 
31884840-31884588 
31916487-31916312 
32021496-32021170 



Minus 



Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus
Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



Minus 
Minus 
Minus 



32434517-32434425 

33414613-33414488 

33796750-33796647 

34043668-34043546 

34193388-34193261 

34254490-34254322 

34524446-34524382 

24230-24160 

1006414-1006184 

1007791-1007634 

1009460-1009291 

1355719-1355637 

1570235-1570142 

2169690-2169569 

4559540-4559266 

4567155-4567005 

5077143-5076943 

51534356153272 

6149843-6149786 

5922748-5922690 

7095797-7095680 

7754282-7754184 

7761421-7761351 

7884521-7864401 

7235048-7234950 

9595602-9595440 

10915338-10915237 

10989617-10989530 

11478551-11478355 

11731444-11731375 

13484103-13483972 

15242294-15242231 

16109555-16109398 

19747608-19747496 

20151152-20151054 

20174286-20174193 

20821897-20821838 

21771238-21771170 

24800712-24800461 

24827522-24827428 



25104153-25104016 
27664798-27664712 
27824238-27824079 
28491807-28491631 
28766345-28766253 
29071537-29071461 
30523414-30523289 
30621603-30621422 
32403103-32402985 
32494335-32494210 
32496590-32496440 
32504250-32504109 
32751331-32751238 
32934756-32934615 
32871258-32971090 
32974634-32974452 
32975943-32975806 



339400 Dunham, I. et al. Minus



339425 Dunham, I. et al.



Minus 
Plus 



34017306-34017205 
34045024-34044940 
34407911-34407798 
140049-140170






329568 3982490
325517 3983513
325313 5866865
325327 5866875
325317 5866878
325257 5666895
329632 6729060
325371 5866920
325375 5866920
325378 5866920
325469 6017034
325470 6017034
325576 6552443
325505 6682451
325543 6682452
329635 5302817
329636 5302817
325593 5856992
325675 5867014
325704 5867028
325682 6138923
325785 6381957
325668 6469822
325818 6682490
329777 6002090
329768 6015501
329759 6048280
329731 6065783
329687 6117858
329676 6272128
329667 6272129
329669 6272129
329670 6272129
329641 6466233
329791 6469354
325826 5867048
325829 5867052
329888 6067149
329893 6525313
329899 6563505
325988 5867064
325855 5867067
325999 5867073
326001 5867073
325886 5867087
325882 5867087
325905 5867104
325922 5867122
325937 5867132
325960 5867147
325961 5867147
325838 6552452
325839 6552452
325840 6552452
325844 6552453
325870 6682492
329984 4646193
329976 4878063
329935 6165200
329916 6223624
330021 6671889
330024 6671908
330028 6671908
326033 5867178
326036 5857178
326056 5687184
326118 5667193
326122 5867194
326138 5867203



Plus 


36331-36750 


Minus 


53197*53269 


Minus 


27385-28192 


Plus 


75189-75264 


Minus 


156551-156649 


Plus 


10867-10955 


Plus 


192813-193017 


Minus 


1035422-1035536 


Minus 


1165503-1165810 


Minus 


1187881-1188167 


Plus 


286823-286991 


Plus 


287576-287663 


Minus 


137769-137894 


Minus 
Plus 


240852-240946 
151873-152057 


Minus 


62522-62622 


Minus 


6496945078 


Minus 


469726-469860 


Plus


955517-955711 


Plus 


156198-156387 


Plus 


370618-370763 


Plus 


6184942003 


Plus


16769-16857 


Minus 


120278-120559 


Minus 


191389-191479 


Plus 


118315-118422 


Minus 


37647-37730 


Plus 


158772-158900 


Minus 


22165-22288 


Minus 


142207-142359 


Plus 


101355-101745 


Plus 


131223-131291 


Plus 


131351-131495 


Minus 


105995-106107 


Minus 


131982-132089 


Minus 


46361-46458 


Plus 


232674-233060 


Minus 


37227-37473 


Minus 


166123-166791 


Minus 


111058-111783 


Plus 


17349-17606 


Plus


276141-276251 


Plus


149115-149192 


Plus


155223-155348 


Plus 


194694-194915 


Minus 


81766347 


Plus


78779-78876 


Minus 


329063-329134 


Minus 


152633-152902 


Minus 


162506-162635 


Minus 


165106-165209 


Plus 


171451-171532 


Plus


181864-182037 


Plus 


184380-184547 


Minus 


14188-14332 


Plus


228209-228297 


Minus 


139780-139890 


Minus 


6258442691 


Minus 


6905949127 


Plus


3639647195 


Plus


120938-121032 


Minus 


1005-1270 


Minus 


3001540144 


Plus 


37261-37333 




120215-120273 


Minus 


181553-181690 


Plus 


45548-45604 


Plus 


144397-144683 


Minus 


178374-179436 
326145 5867204
326180 5867211 
326201 5867216 
326207 5867222 
5 326226 5B67230 
326233 5867232 
326238 5867260 
326241 5867260 
326243 5867261 

10 326251 5867263 
3262® 5857267 
326124 5916395 
326339 6056311 
330049 4567182 

15 326358 5867293 
326365 5867297 
326379 5867327 
326382 5867327 
326390 5867340 

20 326424 5867369 
326453 5867399 
326472 5867404 
326492 5867422 
326533 5867441 

25 330117 6015201 

330115 6015202 

330116 6015202 
330095 6015278 
330098 6015278 

30 326644 5B67559 
326713 5867595 
326745 5867611 

326752 5867615 

326753 5867616 
35 326598 5867634 

326657 6552455 
328855 6552460 
326812 6682504 
327005 5867664 

40 327003 5867664 
326896 5867680 
326904 5867684 
326951 6004446 
326941 6004446 

45 326943 6004446 
326928 6456782 

326958 6469836 

326959 6469836 
327039 6531965 

50 327127 6682520 
330158 6580387 
327204 5867447 
327208 SJ67447 
327268 5867462 

55 327277 5867473 
327289 5867481 
327296 5867492 
327237 5867544 
327145 5887548 

60 327333 5902477 
327335 5902477 
327343 6017017 
327350 6249563 
327358 6552411 

65 327360 6552411 
327409 5867750 
327424 5887751 
327430 5867754 
327470 5867772 



Minus 


52599-52814 


Minus 


182758-163222 


Minus 


166168-166959 


Plus 


4813948219 


Plus 


52644-52705 


Plus 


124788-124863 


Plus 


64282-64338 


Minus 


181648-181916 


Plus 


123838-123978 


Minus 


82716-82822 


Plus 


122114-122765 


Plus 


407102-407560 


Minus 


164637-165251 


Minus 


314682-315210 


Plus 


9122-0195 


Minus 


96630-96764 


Plus


32299-32402 


Minus 


50420-50503 


Minus 


108814-110592 


Minus 


168329-168409 


Plus 


86222-86423 


Plus 


293739-293940 


Plus 


120768-120991 


Minus 


532153*32280 


Minus 


7340-7680 


Plus 


11403-11677 


Plus 


12109-12418 


Plus 


15343-15814 


Plus 


4937049458 


Plus


4268442819 


Plus 


121511-121798 


Plus 


127130-127318 


Minus 


1214-1562 


Plus 


12454-12511 


Plus 


68955*9014 


Plus 


142311-142441 


Minus 


111390-111463 


Plus 


189811-189941 


Plus 


610847-610907 


Plus 


828737-928811 


Minus 


12032-12122 


Minus 


9280*606 


Plus 


193812-193998 


Plus 


62018*2896 


Minus 


89242*9427 


Minus 


291007-291219 


Minus 


4295243082 


Minus 


4315943301 


Plus




Plus


4192542083 


Plus 


81966*2456 


Plus


165135-165239 


Plus


180805-180864 


Minus 


82400*2615 


Minus 


165616-165715 


Plus 


4929649536 


Plus 


7627*166 


Minus 


59702*9813 


Minus 


4048240551 


Minus 


141448-141609 


Minus 


142979-143124 


Minus 


12288-12395 


Minus 


4189041985 
3802*950 


Minus 
Minus 


6255*422 


Minus 


52949*3011 


Plus 


160442-160598 


Plus 


1320-1403 


Plus 


150910-150973 
327460 6004455
327493 6017023 
327509 6117815 
£7510 6117815 
£7512 6117815 
£7535 6525279 
330163 6042042 
330171 6648220 
327579 5867824 
327672 5867843 
327629 5867872 
327640 5867890 
327649 5867899 
327612 6525283 
327718 6525284 
327801 5867824 

327762 5867961 

327763 5867961 
327776 5867864 

327822 5867968 

327823 5867968 
327807 5867868 
327845 6531962 
330228 6013527 
330160 6165182 
328122 5868031 
328132 5868038 
328159 5868065 
328168 5868071 
328175 5868073 
328217 5868096 
327665 5868130 
327866 5868131 
327870 5868131 
327879 5888142 
327902 5868168 
327818 5868165 
327834 5868184 
327859 5868210 
327976 5868212 
328020 5902482 
328042 5902482 
328008 5902482 
330301 2905862 
330299 2905881 
328274 5868219 
326595 5868224 
328591 5868227 
328668 5888254 
328677 5868256 
328687 5868262 
328706 5868270 
328711 5668271 
328730 5868289 
328732 5888289 
£8734 5868289 
£8752 5868298 
£8755 5868301 
328761 5868302 
328775 5888309 
328784 5868309 
328787 5868309 
328809 5868327 
328829 5868337 
328280 5868352 
328311 5868371 
328318 5868373 
328323 5868373 
328348 5868383 



Plus 175245-175343
Minus 42178-42283
Minus 54882-55053
Minus 56824*6944
Plus 176258-176325
Plus 19105-19175
Minus 20321-20335
Plus 110889-111575
Minus 37229-38335
Minus 69649-69740
Plus 49692-49811
Plus 9448-9566
Plus 205871-205927
Plus 2747-2924
Plus 86123-86186
Plus 23239-23348
Minus 50303*0439
Plus 229347-229476
Minus 164308-164488
Minus 168886-169633
Minus 170359-170433
Plus 33745-33811
Plus 193402-193549
Minus 3719-3787
Plus 36103-36243
Plus 158474-158656
Minus 126737-126839
Minus 52957-53162
Plus 60£1*60479
Plus 208-271
Minus 3742-4362
Plus 61503-62205
Minus 28935046
Plus 53558-53757
Minus 77722-77783
Minus 133339-133467
Plus 547530*47591
Plus 4183042036
Minus 46497-46682
Minus 348301-349409
Minus 556386*56652
Minus 1985085-1986626
Plus 296663-297151
Minus 4420*761
Minus 1020-1382
Minus 31244-31439
Plus 148738-148967
Minus 237647-237726
Minus 10888-10984
Minus 58708-58950
Plus 624479*24585
Plus 165501-165614
Minus 97797-97990
Plus 8068*214
Plus 37437*7550
Plus 50*59*0747
Minus 114911-115087
Minus 145959-146446
Minus 239308-239412
Plus 12845-12920
Minus 74523-74604
Plus 135772-135963
Plus 91792*1849
Plus 36309*6630
Plus 160563-160631
Minus 170560-170826
Plus 414945-415620
Minus 1080089-1080235
Minus 260272-260379
328377


5868390 


Plus 


16947-17023 


328436 


5868417 


Plus 


203760-203904 


328504 


5868471 


Plus


4706447217 


328506 


5868471 


Plus 


60716-60830 


328522 


5868477 


Plus 


1972307-1972452 


328525 


5868482 


Plus 


12387-14313 


328541 


5868486 


Plus 


130956-131050 


328662 


6004473 


Plus 


1184773-1184855 


326663 


6004473 


Plus


1185279-1166634 


328803 


6004475 


Minus 


291716-291948 


328304 


6004478 


Minus 


368*3952 


328927 
328936 


5868500 
5868500 


Minus 
Minus 


428829*428893 
1352202-1352259 


328939 


6004481 


Minus 


131139-131320 


328941 


6456765 


Minus 


9817-9885 


328948 


6456765 


Plus


28227-28413 


328968 


6456775 


Plus


117442-118283 


330316 


6007576 


Minus 


119761-119931 


330350 


3056622 


Minus 


26413-26820 


330351 


3056622 


Minus 


27522-27614 


330348 


4544475 


Minus 


19855-19962 


329034 


5868561 


Minus 


32819-32939 


329046 


5868569 


Plus 


18971-19030 


329053 


6868574 


Plus 


426453426541 


329188 


5868711 


Minus 


13108-13225 


329237 


5868729 


Plus 


133238-133339 


329276 


5868762 


Minus 


222629-222709 


329333 


5868806 


Plus


392666-392746 


32S376 


5868859 


Plus 


52356-52694 


329384 


6868869 


Minus


116524-116662 


329140 


6017060 


Plus


290842-290905 


329317 


6381976 


Plus 


614823*15209 


329319 


6381976 


Plus 


721390-721470 


329129 


6588026 


Plus


144569-144712 


329373 


6682537 


Minus 


38950-39301 


329412 


6682553 


Minus 


68948*9041 


329424 


5868679 


Plus 


36218*362344 


329446 


5868886 


Plus


84776*4899 


329449 


5868B86 


Plus 


97697-97771 
TABLE 14: shows genes, including expression sequence tags, down-regulated in prostate
tumor tissue compared to normal prostate tissue as analyzed using Affymetrix/Eos Hu02
GeneChip array. Shown are the ratios of "average" normal prostate to "average" prostate
cancer tissues.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: Background subtracted normal prostate : prostate tumor tissue

Pkey ExAccn UnigeneID Unigene Title R1


331328 AA281133 Hs.88808 ESTs 16.53
320875 D60641 Hs.131921 ESTs 1455
300994 AI251936 Hs.146298 ESTs 12.17
323461 AA418762 Hs.190044 ESTs 1055
301015 AA947682 Hs.217173 ESTs; Weakly similar to Chain A; Cdc42hs-Gdp Complex [H.sapiens] 10.17
319419 AA543Q95 Hs.13648 ESTs; Highly similar to mitogen-induced [M.musculus] 93
323486 C05278 Hs.166800 ESTs; Moderately similar to [PYRUVATE DEHYDROGENASE(LIPOAMIDE)] KINASE ISOZYME 4 PRECURSOR [H.sapiens] 8.87
324882 AW419080 HS350645 ESTs 8
330569 U57796 Hs.57679 zinc finger protein 192 738
330126 CR21j2Ql|6093735 73
316265 AA737400 Hs.142230 ESTs 7.7
323045 AA148950 Hs.186836 ESTs 7.64
320668 R58399 Hs.146217 ESTs 7.4
330769 AA465192 Hs.16514 ESTs 7.15
312614 AT766732 Hs.201194 ESTs 7
314790 AW341754 Hs.189305 ESTs 633
309979 AW452118 Hs.257533 EST 6.74
3114238 AA743396 Hs.188023 ESTs 6.49
329192 CHJLhsgi|5868716 6.1
324307 AA627642 Hs.4994 transducer of ERBB2; 2 (TOB2) 5.99
303685 AW500106 EST cluster (not in UniGene) with exon hit 532
314921 AW452382 Hs.257564 ESTs 53
315840 AA679001 Hs.192221 ESTs 538
332776 AA034364 Hs.256551 ESTs; Weakly similar to !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens] 5.43
313533 AW298141 Hs.157975 ESTs 5.4
303494 F30712 EST cluster (not in UniGene) with exon hit 535
317490 AI627358 Hs.148367 ESTs 531
332546 D84454 Hs.21899 solute carrier family 35 (UDP-galactose transporter); member 2 535
334719 CH22J=GB1ES.421_30 535
300679 AA813958 Ha307727 ESTs; Moderately similar to KIAA0071 [H.sapiens] 532
311811 AI625304 Hs.190312 ESTs 532
315310 AW511298 Hs356067 ESTs 5.19
312871 H86747 Hs.2276Q2 KIAA1116 protein 5.11
324715 AI73916S EST cluster (not in UniGene) 437
313870 AW206435 Hs.146057 ESTs 437
321453 N50080 HS.1178Z7 ESTs 4.78
316160 AW197887 HS253353 ESTs 4.63
313833 AA766825 EST cluster (not in UniGene) 438
315850 AW270550 Hs.116957 ESTs 433
303124 AF161350 EST cluster (not in UniGene) with exon hit 4.46
323346 AL134932 Hs.143607 ESTs 44
301383 AA913591 Hs.126480 ESTs 435
324513 AW501678 Hs.164577 ESTs 438
303480 AA331906 EST cluster (not in UniGene) with exon hit 435
323591 AA301270 EST cluster (not in UniGene) 432
313603 AW468119 EST cluster (not in UniGene) 43
317863 AT733395 Hs.129124 ESTs 4.1
312381 R42049 Hs.195473 ESTs 438
317514 AW451570 Hs.126850 ESTs 433
319750 AAS21606 Hs.117956 ESTs 433






322620 T55958 EST cluster (not in UniGene) 4
314754 AW026761 Hs.134374 ESTs 4
316088 AI990652 Hs308973 ESTs 4
318473 AI939339 Hs.146883 ESTs 3.95
307848 AI36418B EST singleton (not in UniGene) with exon hit 3.95
300730 AW449204 Hs.257125 ESTs 354
303034 W60843 Hs.31570 ESTs 333
324668 AI679131 Hs301424 ESTs 35
324674 AA541323 Hs.115831 ESTs 338
300547 N53442 Hs.143443 ESTs a83
316100 AW203986 Hs.213003 ESTs 3.79
314601 AA481G27 Hs.127336 ESTs; Weakly similar to ORF YGR245C [S.cerevisiae] 3.75
320856 D59945 EST cluster (not in UniGene) 3.74
313188 AIQ397Q2 Hs.179573 collagen; type I, alpha 2 3.73
314187 AA804409 Hs.118920 ESTs 3J3
311826 AA765470 Hs.122826 ESTs 3.7
302358 D81150 EST cluster (not in UniGene) with exon hit 3.68
311441 Z38720 Hs.151014 ESTs 3.66
321914 AA011603 EST cluster (not in UniGene) 3^9
332216 H95082 Hs.102332 EST 352
324771 AA831739 EST cluster (not in UniGene) 33
323691 AA317561 EST cluster (not in UniGene) 3.49
303525 AW516519 Hs.115130 ESTs 347
309709 AW242630 EST singleton (not in UniGene) with exon hit 346
300038 AFFX control: MurlU 338
316526 AI088192 Hs.135474 ESTs; Weakly similar to ATP-DEPENDENT RNA HELICASE A [H.sapiens] 336
313029 AA731520 Hs.170504 ESTs 335
304356 AA196027 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 334
314610 AI948686 Hs.191805 ESTs 333
329815 CH.14_p2gll6624888 332
314949 A1745387 Hs339124 ESTs 331
300598 N53574 Hs.158932 ESTs 33
329218 CHXJlS0j|5868728 328
315706 AW440742 Hs.155556 ESTs 338
303751 AW503637 EST cluster (not in UniGene) with exon hit 335
307783 AI347274 EST singleton (not in UniGene) with exon hit 335
321414 AA324975 Hs.128993 ESTs; Weakly similar to KIAA0465 protein [H.sapiens] 335
312187 AA700439 Hs.188490 ESTs 335
334061 CH22LFGBIES327J4 333
336036 CH22LFGENES378.7 333
321477 H67818 Hs322059 ESTs 331
315760 AW139383 HS345437 ESTs 33
316733 AA811713 Hs.163222 ESTs 33
300855 AW235248 Hs.79828 ESTs 33
323611 AA304986 Hs.145704 ESTs 3.19
314138 AA740616 EST cluster (not in UniGene) 3.17
316774 AA814859 EST cluster (not in UniGene) 3.16
308884 A1833131 Hs.179100 ESTs 3.11
331317 AA258222 Hs.87757 ESTs 3.1
317221 AJ989538 Hs.191074 ESTs 3.08
316386 AA749062 Hs.180285 ESTs 3.08
321040 H26953 EST cluster (not in UniGene) 338
308828 A1824829 EST singleton (not in UniGene) with exon hit 3.08
300778 AA236233 Hs.188716 ESTs 3X17
316687 AW015940 Hs332234 ESTs a07
324614 AW503101 EST cluster (not in UniGene) 3.07
316468 AW293046 Hs355158 ESTs 3.07
300671 AI239706 Hs.189886 ESTs 3.06
314301 AW297987 Hs.188181 ESTs 335
312335 AW043620 Hs336993 ESTs 3.03
322957 AA247755 EST cluster (not in UniGene) 331
316848 AA830053 Hs.126798 ESTs 331
313473 AA009660 Hs351948 ESTs; Moderately similar to T07D3.7 [C.elegans] 239
318516 T27119 EST cluster (not in UniGene) 238
313383 AI076370 Hs.134037 ESTs 237
331389 AA458637 Hs.152207 ESTs 236
304257 AAD53294 EST singleton (not in UniGene) with exon hit 235
309917 AW340014 EST singleton (not in UniGene) with exon hit 235
319661 H08035 Hs31398 ESTs; Moderately similar to PUTATIVE GLUCOSAMINE-6-PHOSPHATE
321253 A16994S4
321193 AA149508 



ISOMERASE [H.sapiens]
EST cluster (not in UniGene)
Hs.103288 ESTs

L4 



300027 






320014 
333916 
318685 
318146 



305703 



317672 
323416 
312652 
324094 
319761 
317013 
317383 
314659 
312479 



311624 
321992 
316074 



312071 
312684 



322139 
304168 
325602 
319885 
300611 
316854 
318208 
331623 
324616 



M11507 

AA884766 

AA137114 

Z43272 
A1040125 
AA233058 
AA825148 

AW205409 

AI610397 

AI419909 

AA382603 

RB4237 

AA864468 

AA913887 

AWZ77121 

AI950844 

AW293826 

C06003 

AW517542 

AW296076 

AA683529 

AW294020 

AA062971 

H53744 

K77679 



N75450 
AA831215 
AI091458 
R38715 



314912 
300767 
313463 
320600 
301180 



AA614308 
AI431345 
A W1 93466 
A1057369 
AA135565 



AA704457 
AW282417 



317850 N29974 
339047 

324580 AA492588 

321142 AI817933 

319478 R06841 

300793 AI248571 

313733 AA836116 



AW015506 
AF090948 
H24244 
AI209108 



AA324437 

AW157377 

AW136134 

A1479011 

AI743261 

AW293174 



314987 
303114 
318709 
312878 
329224 
328018 
323231 
312887 
315183 



AFFX control: transferrin receptor 

EST cluster (not in UnlQene) 
Hs.170291 ESTs 

CH22_FGENES.296_5 

EST cluster (not In UniGene) 
Hs.160521 ESTs 
Hs.191518 ESTs 
Hs£1229 Hwx protein Fbwlb 

CH22_FGENES.629_7 
Hs.127748 ESTs 
Hs.159560 ESTs 
Hs, 160994 ESTs 

EST cluster (not In UniGene) 

EST cluster (not in UniGene) 
Hs.135646 ESTs 
Hs.126511 ESTs 
HS254881 ESTs 

Hs.128738 ESTs; Weakly similar to non-tens beta gamma-crystaJUn tike protein [H^aplens] 

CH22_FGBiES.7J0 
H&250610 ESTs 
Hs.1 16456 ESTs 
Hs£08382 ESTs 

EST singleton (not In UniGene) with exon hft 
Hs.143119 ESTs 
Hs.1 17721 ESTs 

Hs.181161 ESTs; Weakly slmflar to INHIBITOR OF APOPTOSIS PROTEIN 1 [Mjnusculus] 

EST cluster (not in UniGene) 

EST singleton (not in UniGene) with exon hft 

CH.13_hsgl]5866994 
Hs.136698 ESTs 

EST duster (not In UniGene) with exon hit 
Hs.1 59065 ESTs; Weakly similar to predicted using Genefiruier [Celegans] 
Hs.1 34559 ESTs 

Hs.153529 Homo sapiens done 24540 mRNA sequenoe 
Hs.162000 ESTs 

EST singleton (not in UniGene) with exon hit 
Hs.161784 ESTs 
Hs.136525 ESTs 
Hs.122536 ESTs 
H&250739 ESTs 
Hs.156939 ESTs 

Hs255738 ESTs; Moderately similar to gag [Rsapiens] 

Hs£55074 ESTs; Moderately simBar to high-risk human papilloma viruses E6 

oncoproteins targeted protein E6TP1 alpha [Haptens] 

EST cluster (not In UniGene) 

CH2*J)A59H18.GENSCAN.28-7 

EST duster (not in UniGene) 
HsSQ9584 ESTs 

EST duster (not In UniGene) 
Hs.186837 ESTs 

EST duster (not in UniGene) 

CK19Jwgll5867435 
Hs.130730 ESTs 

EST duster (not in UniGene) with exon hit 
H&240763 ESTs; Weakly similar to /prediction 
Hs.143946 ESTs 

CHXJ»g?5868728 



313240 
316697 



Hs.177230 ESTs 
Hs.132910 ESTs 
Hs.230277 ESTs 
Hs.170783 ESTs 
Hs.131860 ESTs 
H&252627 ESTs 



235 
233 
233 
232 

£91 

238 

238 

238 

237 

237 

235 

234 

233 

232 

231 

231 

231 

23 

23 

2.78 

2.78 

2.77 

2.75 

2.75 

2.73 

2.73 

2.73 

2.73 

2.72 

2.72 

2.72 

2.72 

2.71 

2.71 

2.71 

239 

238 

238 

238 . 

237 

237 

237 

235 

235 

235 

235 

234 

234 

234 

233 

232 

232 

231 

23 

23 

23 

239 

238 

237 

236 

236 

235 

235 

235 

234 

234 

233 
313966 AI807551 Hs.169061 ESTs 2.53
331263 AA015718 ze31a12.r1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE385743', mRNA sequence 2J51
310683 AW055233 Hs.160870 ESTs 25
302566 AA0B5996 Hi248572 Human PAC clone DJ404F18 from Xq23 25
302697 AJ0014O8 EST cluster (not in UniGene) with exon hit 25
308362 AI613519 EST singleton (not in UniGene) with exon hit 2.49
322347 AF086538 EST cluster (not in UniGene) 2.49
316240 AA9742S3 Hs.120319 ESTs 2.49
323208 AA203415 Hs.136200 ESTs £48
321643 W76005 Hs52094 ESTs 2.48
330723 AA243617 Hs51082 ESTs; Highly similar to db83 [R.norvegicus] 2.46
323455 AA256675 Hs.200438 ESTs; Weakly similar to atypical PKC specific binding protein [R.norvegicus] 2.47
308383 AI824497 EST singleton (not in UniGene) with exon hit 2.47
328744 CH.07Jwgi|5868290 2.47
332344 W45574 Hs£52497 ESTs 2.47
328121 CH.06Jwgi|586B031 2A7
321915 AB70955 Hs.200151 ESTs 2.46
314954 AA521381 Hs.187726 ESTs 2.45
302821 AA168868 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 2.45
329454 CH.YJ«gi|5B68887 2.45
835505 CH2£J=GENES.420_4 2.45
300664 AI444628 Hs.256809 ESTs 2.44
323362 AL135067 Hs.117182 ESTs 2.44
300024 M10093 AFFX control: 18S ribosomal RNA 2.44
325026 AI571168 Hs.12285 ESTs 243
324510 AI143353 Hs.120849 ESTs 2.43
313389 AI765182 Hs.119903 ESTs 2.43
301309 M78276 Hs.255917 ESTs 2.43
313570 AA041455 Hs£09312 ESTs 243
316504 AW135854 Hs.132458 ESTs 242
319401 R01342 EST cluster (not in UniGene) 2.42
312827 AI744361 Hs£Q5591 ESTs; Weakly similar to zinc finger protein Png-1 [M.musculus] 2.42
327871 CR06_hsglj5868131 241
337173 CH2JLR5ENES.565-3 2.41
302948 AA465635 EST cluster (not in UniGene) with exon hit 2.41
324303 AL118754 EST cluster (not in UniGene) 2.4
315527 AI791138 Hs.116768 ESTs 24
315979 AA830515 Hs222917 ESTs 24
331310 AA&3351 Hs.44439 STAT induced STAT inhibitor-4 24
321095 AA017595 Hs52844 ESTs 24
308561 AI701559 EST singleton (not in UniGene) with exon hit 2J39
313035 N36417 Hs.144928 ESTs 237
322114 AA643791 Hs.191740 ESTs 237
313871 W49823 Hs.145553 ESTs 257
303211 AA099548 Hs.191436 ESTs; Highly similar to dJI 1 18D244 [H.sapiens] 257
301256 AA932948 EST cluster (not in UniGene) with exon hit 256
338165 CH22_EMiACO05500.GENSCAN212~3 256
324692 AA557952 EST cluster (not in UniGene) 255
318587 AA779704 Hs.168830 ESTs 255
312378 R41582 Hs.109219 retinal degeneration B beta 255
318625 T48448 Hs.193162 ESTs 255
305181 AA663726 Hs.116922 EST 255
300815 AA286678 EST cluster (not in UniGene) with exon hit 254
324063 AW292740 HS554815 ESTs 254
315859 AA682305 Hs.133268 ESTs 253
305092 AA642912 EST singleton (not in UniGene) with exon hit 253
306598 AJ000320 EST singleton (not in UniGene) with exon hit 253
300307 AI651016 Hs546311 ESTs 253
321348 Z49979 EST cluster (not in UniGene) 253
325112 AI90377O Hs.124344 ESTs 252
336679 CH2£J=GENES437 252
321383 AJ002574 EST cluster (not in UniGene) 252
337357 CH22_FGENES.730* 251
300680 AVW68066 Hs.257712 ESTs; Weakly similar to KIAA0986 protein [H.sapiens] 251
327120 CR21_hsgiiS531970 251
302761 AW25Q553 EST cluster (not in UniGene) with exon hit 25
312132 AW75490 Hs.170577 ESTs 25
315639 AA827652 EST cluster (not in UniGene) 25
312189


T95594 


Hs.1 87435 ESTs 


23 


306537 


AA991705 


EST singleton (not In UmGena) wan exon nit 


A A 


327061 




CH21_nsgi[6531965 


Z3 


315391 


AA759098 


Hs.192007 ESTs 


Z3 


322384 


AI968646 


H&33862 ESTs 


229 


323206 


AA203339 


H&220750 ESTs 


229 


318110 


AI680915 


Hs201379 ESTs 


228 


335250 




CH22_FGENES.516J 1 


228 


331696 


Z38907 


Hs.91662 K1AA0888 protein 


A QO 

228 


318327 


AW294013 


Hs200942 ESTs 


228 


324980 


AA969121 


Hs254298 ESTs 


O AO 

228 


319429 


AI608881 


Hs.1 1482 ESTs; Highly similar to JuncronaJ adhesion molecule [H .sapiens] 


AAA 

229 


310601 


A1970543 


HS.1 92605 ESTs 


228 


318905 


Z43395 


EST duster (not in UniGene) 


228 


323442 


AA252753 


Hs.1 64039 ESTs 


227 


304428 


AA342250 


H&99819 ublquffin specific protease 16 


227 


313352 


AW2921Z7 


Hs.144758 ESTs 


227 


316491 


AA766025 


HS238794 EST 


227 


317751 


AI697668 


Hs202241 ESTs 


226 


314138 


AA229781 


H&221962 ESTs 


226 


306665 


AI004614 


Hs.1 30577 EST 


226 


303946 


AW474196 


H&221604 ESTs 


225 


313435 


AA769123 


EST cluster (not in UniGene) 


225 


317679 


AA988799 


Hs.150289.ESTs 


225 


322370 


AA330095 


EST duster (not in UniGene) 


225 


306620 


AI000929 


EST singleton (not In UniGene) with exon hit 


224 


329109 




CKXJisgl|5868626 


224 


311043 


AI871209 


Hs.177128 ESTs 


224 


300228 


AI458372 


Hs.1 58748 ESTs; Weakly similar to synapsln lb [Mmusculus] 


224 


307223 


AI193698 


Hs.1 84776 rtbosomal protein L23a 


224 


309023 


A1888045 


EST singleton (not in UniGene) with exon h& 


223 


310749 


AI493675 


Hs.170332 ESTs 


223 


316769 


W914939 


Hs2121B4 ESTs 


222 


320409 


AA356195 


EST duster (not in UniGene) 


221 


333149 




CH22.FGB4ES.87_B 


221 


324951 


M86125 


Hs.137487 ESTs 


221 


321939 


AI791617 


Hs.145068 ESTs 


22 


3gQ594 


Ai 863952 


Hs.169436 anglnyttransf erase 1 


22 


320722 


R67430 


Hs.172787 ESTs 


22 


321781 


D78667 


EST duster (not in UniGene) 


22 


328903 




Ca08Jisgi|5868514 


22 


303889 


T18204 


EST duster (not in UniGene) wffli exon hit 


22 


325045 


T08845 


EST cluster (not in UniGene) 


22 


312628 


AI865455 


HS211818 ESTs; Moderately similar to 1111 ALU SUBFAMILY J WARNING ENTRY Ml [Haptens] 2.1 


335109 




CH2^FGENES.494_15 


2.18 


330878 


AA131471 


Hs.71440 ESTs 


2.18 


311289 


AI971362 


Hs231945 ESTs 


2.18 


304608 


AA513456 


EST singleton (not in UniGene) wfih exon hit 


2.18 


337393 




CH2a.FGBIES.747-4 • 


2.18 


332812 




CH22JGENES.7J4 


2.18 


327665 




CK04_hsgI)5887839 


2.18 


314581 


AW504859 


HS237849 ESTs 


2.17 


326508 




CH.19Jisgi|6682496 


2.17 


301242 


AW1 61535 


H&258803 ESTs 


2.17 


312780 


AI785651 


Hs.172900 ESTs 


2.17 


315954 


AW276810 


H&254859 ESTs 


2.16 


311179 


AI880843 


H&223333 ESTs 


2.16 


315320 


AI084182 


Hs.186885 ESTs 


2.16 


313017 


AI015203 


Hs.118015 ESTs 


2.16 


312430 


AW139117 


Hs.117494 ESTs 


2.15 


300864 


AA406539 


Hs.190958 ESTs 


2.15 


314753 


AA463262 


EST duster (not in UniGene) 


2.15 


322574 


AF156548 


EST duster (not in UniGene) 


2.15 


321409 


C03864 


EST cluster (not in UniGene) 


2.15 


321205 


AA002047 


EST duster (not in UniGene), 


2.14 


320406 


AA353895 


Hs.152983 HUS1 (S. pombe) checkpoint homolog 


2.14 


337646 




CH22_EMJ«X00097.GENSCAN.1 1-2 


2.13 


303084 


AF174008 


EST cluster (not In UniGene) with exon ha 


2.13 


312185 


AA654772 


Hs.186564 ESTs 


2.13 
306813


AI066544 


314465 


AAEQ2917 


318168 


AI821782 


315990 


AI800041 


320712 


R66887 


318487 


AI167877 


317462 


AW015206 


304384 


AA235482 


314544 


AA399018 


319881 


T72744 


328078 
317354 


AW09Q770 


308617 


AI738720 


311568 


AW439969 


313605 


AI761786 


314289 
332933 


AAB48118 


325498 




313659 


AW286067 


324596 


AW149321 


324783 


AA540770 


302696 


AA347452 


313418 


AW450674 


326920 
327574 




323207 


AIO52705 


303753 


AW503733 


305235 


AA670480 


316055 


AA693880 


317194 


AW445167 


319565 


AW408683 






301475 


AI678183 


312442 


M120970 


322502 


R62925 


303693 


AA290875 


310179 


AI215643 


391191 


W23285 


331330 


AA282197 


306557 


AA994530 


317865 


A1298794 


318667 


A1493742 


318042 


AW294522 


323818 


AW245528 


331288 


AA137062 


311262 


A1989942 


335601 
311351 


AIS82303 


312996 
328190 


AA249018 


333940 




328227 
331481 


N27448 


335288 




307513 


A1274307 


323316 


AL134620 


319479 


R21945 


303482 


AA5Q2583 


327489 




323935 


AW175841 


309575 


AW168096 


337043 




312897 


AI828174 


307881 


AJ370434 


328656 
314569 


AA813784 


332783 


W45302 


315259 


AA701499 



EST singleton (not in UniGene) with exon hit 2.13

Hs.156974 ESTs 2.12
Hs.220587 ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens]

Hs.180555 ESTs 2.11

EST cluster (not in UniGene) 2.11

Hs.143716 ESTs 2.11

Hs.178784 ESTs 2.11

Hs.62954 ferritin; heavy polypeptide 1 2.11

Hs.250835 ESTs 2.1

EST cluster (not in UniGene) 2.1

Ca06_hsgp68008 2.1

Hs.102271 ESTs 2.1

EST singleton (not in UniGene) with exon hit 2.09

Hs.218177 ESTs 2.09

Hs.204674 ESTs 2.09

Hs.221216 ESTs 2.08

CH22_FGENES.38_7 2.08

CH.12J*g]]5866967 2.08

Hs.124108 ESTs 2.08

Hs.105411 ESTs 2.08

EST cluster (not in UniGene) 2.07

EST cluster (not in UniGene) with exon hit 2.07

Hs.114695 ESTs 2.06

CR21Jwgi|64567B2 2.06

CHX3Jisgi|58S7818 2.06

Hs.182201 ESTs 2.06

Hs.170315 ESTs 2.05

EST singleton (not in UniGene) with exon hit 2.05

EST cluster (not in UniGene) 2.05

Hs.126036 ESTs 2.05

Hs.32922 ESTs 2.05

CH22_FGENES.499^ 2.05

Hs.170917 prostaglandin E receptor 3 (subtype EP3) 2.04

Hs.143199 ESTs 2.04

Hs£43665 ESTs 2.04

Hs.30120 ESTs 2.04

Hs.171381 ESTs 2.03

EST cluster (not in UniGene) 2.03

Hs.890Q2 ESTs; Highly similar to CGI-07 protein [H.sapiens] 2.03

EST singleton (not in UniGene) with exon hit 2.03

Hs.129130 ESTs 2.03

Hs.165210 ESTs 2.02

Hs.149991 ESTs 2.02

Hs.134754 ESTs 2.02

Hs.103853 ESTs 2.01

Hs.232150 ESTs 2.01

CH22_FGENES£81_41 2.01

Hs.201274 ESTs 2.01

EST cluster (not in UniGene) 2.01

CHXSJis gi|5868077 2

CH2^EMAO005500,GBISCAN.148-16 2

CH2£J=GBIES.301ja 2

CHX6Jisg!|5868105 2

Hs.43944 EST 2

CH22J=GBIES.527J 2

EST singleton (not in UniGene) with exon hit 2

EST cluster (not in UniGene) 2

Hs.256153 ESTs 2

Hs.197271 ESTs 2

CHX2J1SOJI60Q4459 1.99

Hs.192183 ESTs 1.99

Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 189

CK23JGENE&439-19 1.98

Hs.227049 ESTs 1.98

EST singleton (not in UniGene) with exon hit 1.98

CHX7JIS gil6004473 1.98

Hs.123001 ESTs 1.98

HSJ87889 helicase-moi 1.98

Hs.148115 ESTs 1.98
313171 N67879 Hs.157695 ESTs 1.97

318060 AK41421 Hs.132238 ESTs 1.97

332256 N88393 Hs.102754 ESTs 1.97

312110 AI962180 Hs.226803 ESTs 1.97

335854 CH22.FGENES.629J 1.97

320389 W00545 Hs.171785 ESTs 1.97

314065 AA868267 Hs.55524 ESTs 1.96

323086 H15474 Hs.12214 Homo sapiens clone 23716 mRNA sequence 1.96

323919 AA862973 Hs.220704 ESTs 1.96

310750 AI373163 Hs.170333 ESTs 1.96

309435 AW090537 EST singleton (not in UniGene) with exon hit 1.96

300129 AW028820 EST cluster (not in UniGene) with exon hit 1.96

320130 AI820675 Hs.203804 ESTs 1.95

323787 AW373446 Hs.169885 ESTs; Weakly similar to cDNA EST EMBL:T02216 comes from this gene [C.elegans] 1.95

338112 <mBAAC005mGBJSCAN.185.24 155 

313625 AW468402 Hs.254020 ESTs 1.95

325240 CH.10j»gi|5866848 1.95

331833 AA412102 Hs.250911 interleukin 13 receptor; alpha 1 1.95

332252 N63882 za21©.8l Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:293225 3', mRNA sequence 1.95

300279 AW237425 Hs.253817 ESTs 1.95

326023 CK17Jisgi|5867245 1.95

321609 H66021 Hs.198800 ESTs; Weakly similar to hMmTRA1b [H.sapiens] 1.94

324183 AA4Q2453 Hs.113011 ESTs 1.94

336278 CH22_FGENES.762_5 1.94

334913 CH22_FGENES.456_3 1.94

325417 CR12Jlsgi|5866925 1.94

318489 AW043590 Hs.225023 ESTs 1.94

318455 AI148763 EST cluster (not in UniGene) 1.94

306890 AI092235 EST singleton (not in UniGene) with exon hit 1.94

315073 AW452948 Hs.257631 ESTs 1.94

321289 R84687 Hs.226306 ESTs 1.94

308521 AI689808 EST singleton (not in UniGene) with exon hit 1.93

306382 AA968967 EST singleton (not in UniGene) with exon hit 1.93

331320 AA262999 Hs.42788 ESTs 1.93

324279 AA501412 Hs.191688 ESTs; Weakly similar to Pro-Pol-dUTPase polyprotein [M.musculus] 1.93

309577 AW168753 EST singleton (not in UniGene) with exon hit 1.93

327014 CH21,hsgIl5867664 1.93

303488 AWQ25860 EST cluster (not in UniGene) with exon hit 1.93

306561 AA995223 Hs.129559 EST 1.92

330694 AA019806 Hs.108447 spinocerebellar ataxia 7 (olivopontocerebellar atrophy with retinal degeneration) 1.92

313083 N50545 Hs.159200 ESTs 1.92

327752 CHJ05Jisgi|5867949 1.92

318674 AA295490 EST cluster (not in UniGene) 1.92

301287 AW297762 Hs.255690 ESTs 1.91

332092 AA6087B7 Hs.112590 ESTs 1.91

323509 ALD36947 EST cluster (not in UniGene) 1.91

321452 AA317554 EST cluster (not in UniGene) 1.91

311483 AI765013 Hs.209128 ESTs 1.91

300976 AI246374 Hs.165861 ESTs 1.91

323715 AA322155 EST cluster (not in UniGene) 1.91

313800 AW2S6132 Hs.166674 ESTs 1.91

332029 AA489697 Hs.145053 ESTs 1.91

304013 AW51B573 Hs.156110 immunoglobulin kappa variable 1D-8 1.91

322019 AA354549 Hs.41181 Homo sapiens mRNA; cDNA DKFZp727C191 (from clone DKFZp727C191) 1.91

334150 CH22JGBIES539J 1.9

310094 AW450967 Hs.235240 ESTs 1.9

316216 AVU207642 Hs.174021 ESTs 1.9

324774 AI031771 Hs.132588 ESTs 1.9

326507 CH.19jtsgI|5867435 1.9

314570 AA405696 EST cluster (not in UniGene) 1.9

836268 CH22LFGENES.758.2 1.9

315278 AJ985544 Hs.116429 ESTs 1.9

325824 CH.15Jisgil5867048 1.9

316277 AA737780 Hs.213392 ESTs 1.9

323181 AA418583 Hs.143621 ESTs 1.9

301438 AA961643 Hs.127716 ESTs 1.89

307050 AI147341 Hs.146734 EST 1.89

306830 AI075803 EST singleton (not in UniGene) with exon hit 1.89
302426 AUM9925 Hs.225984 DKFZP547G0910 protein 1.89

320127 H72615 Hs.17268 ESTs 1.89

337738 CH22_BAAC000097.GENSCAN.10(W 1.89

331319 AA262755 Hs.194264 ESTs 1.88

310767 AI377505 Hs.153835 ESTs 1.88

314880 AI732169 Hs.105429 ESTs 1.88

312539 AI004377 Hs.200360 ESTs 1.88

309874 AW205604 Hs.168034 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.88

314621 AI827478 Hs.187670 ESTs 1.88

319495 AI972148 Hs.182756 ESTs 1.88

313472 AA007374 EST cluster (not in UniGene) 1.88

302705 U09060 EST cluster (not in UniGene) with exon hit 1.88

329511 CH.10_p2gi|3983514 1.88

317140 AI699412 Hs.201925 ESTs 1.87

302598 AI815985 Hs.129683 ubiquitin-conjugating enzyme E2D 1 (homologous to yeast UBC4/5) 1.87

301153 AA725670 Hs.120485 ESTs; Weakly similar to serine/threonine kinase with SH3 domain; leucine zipper domain and proline rich domain [H.sapiens] 1.87

332222 N28271 Hs.176618 ESTs 1.87

330703 AA055475 Hs.104143 clathrin; light polypeptide (Lca) 1.87

318470 AI159863 Hs.143713 ESTs 1.87

314014 AW291847 Hs.121715 ESTs; Weakly similar to HP protein [H.sapiens] 1.87

300370 AI827817 EST cluster (not in UniGene) with exon hit 138

312329 R84768 Hs.13399 Homo sapiens clone 25032 mRNA sequence 1.86

325587 CH.12Jis gT]6682462 1.86

310237 AI884313 Hs.158906 ESTs 1.86

318872 R13085 EST cluster (not in UniGene) 1.86

303431 AA317915 EST cluster (not in UniGene) with exon hit 1.86

338427 CH22_EfytACOO5500.GENSCAfi349-1 1.86

300452 AI352293 Hs.191093 ESTs 1.85

321279 H85330 Hs.146060 ESTs 1.85

301690 F05865 H&549180 ubiquitin-conjugating enzyme E2E 2 (homologous to yeast UBC4/5) 1.85

307932 AJ230822 EST singleton (not in UniGene) with exon hit 1.85

318292 AI679966 Hs.150603 ESTs 1.85

310254 AI239811 Hs.157491 ESTs 1.85

311790 AW0164S7 Hs.233462 ESTs 1.84

314248 AA278347 Hs.126078 ESTs 1.84

335588 CH2a_FGENES381_25 1.84

339209 CH22_FF113D11.GENSCAN.64 1.84

307954 AI419692 EST singleton (not in UniGene) with exon hit 1.84

302549 AF055138 Hs.248162 tectorin alpha 1.84

321829 H87213 Hs.158092 ESTs 1.84

301239 AA807558 EST cluster (not in UniGene) with exon hit 1.84

332434 N75542 Hs.75356 transcription factor 4 1.84

327192 CR01_hsgip887445 1.83

310214 AI220072 Hs.165893 ESTs 1.83

320516 R33857 Hs.161479 ESTs; Weakly similar to E-SELECTIN PRECURSOR [H.sapiens] 1.83

324231 W60827 EST cluster (not in UniGene) 1.83

336616 CH22_FGENES.613_5 1.83

328799 CH.07_hs gij5B68316 1.83

324661 AW504161 EST cluster (not in UniGene) 1.83

313190 AA766707 Hs.153039 ESTs 1.83

301979 L28168 Hs.121495 potassium voltage-gated channel; Isk-related family; member 1 1.82

302099 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 1.82

320187 T99949 EST cluster (not in UniGene) 1.82

320791 R78808 Hs.33961 ESTs; Weakly similar to !!!! ALU CLASS A WARNING ENTRY !!!! [H.sapiens] 1.82

305733 AA829535 Hs.34298 CD74 antigen (invariant polypept of MHC; class II antigen-associated) 1.82

308280 AI589349 Hs.160920 ribosomal protein S9 1.81

321533 W78877 Hs^0111 ESTs 1.81

312946 AI915122 Hs.204087 ESTs; Weakly similar to F33D1 13b [C.elegans] 1.81

319474 H9Q265 Hs.100638 ESTs 1.81

329519 CH.10JJ2 gip983510 1.81

324685 AA220982 EST cluster (not in UniGene) 1.81

320697 N62937 Hs.139181 ESTs 1.81

329246 CRXJegil5868732 1.81

332000 AA481271 Hs.193945 ESTs 1.81

310811 AI420990 Hs.161303 ESTs 1.81

325866 CH.16„hsgI|5867076 1.81

322064 Z78343 EST cluster (not in UniGene) 1.8

333712 CH22LFGENES-2>1_1 1.8
313457 AA576052
321591 H85687 



Hs.193223 
Hs.117927 



311080 AI558320 H&107711 



HS211417 
H&204877 
Hs.244624 
Hs.196115 

Hs.188634 
Hs.148958 



Hs.187902 
Hs.186387 
Hs.118348 
Hs.125382 



Hs*07514 

Hs.21169 
Hs.117904 

Hs.226469 
H&31783 



322889 AA081924 

300175 A1275011 

330976 H20560 

300208 A1341180 

319635 R17531 

313454 AA730673 

303093 AI400310 

309815 AW^2760 



AAS49011 
AI523739 



319845 
300290 
312180 
313058 
330120 
328412 
302345 
308100 
311386 
330282 
318856 



321206 
330977 
303487 
310398 
313230 
317747 
303381 
338123 
300185 
316002 
319850 
329941 



081015 



MAJ000565 

AJ475949 

AW205705 

Z43011 
AA845630 

H54178 

H20826 

AA333666 

AB64671 

AI540166 

AI683782 

AL038841 

AE86182 

AW451733 

AA001811 



303530 
300980 
331909 
321553 
301618 
319592 
318511 
327183 
313516 
318644 
321632 
324657 
300437 
319775 
314775 
337460 



301471 
312739 
319995 



337497 

322633 AA004534 
332177 F10812 



ESTs 
ESTs 

CR05_p2 9IJ6671884 
ESTs 

Cai0jJ2giI3983507 

ESTs 

ESTs 

ESTs 

ESTs; Weakly similar to FIBRILLIN 1 PRECURSOR [H.sapiens]

EST cluster (not in UniGene)

ESTs

ESTs

EST singleton (not in UniGene) with exon hit

CH.19_hsgI)58S7435 

ESTs 

ESTs 

ESTs 

ESTs 

CH.19JJ2 #671884 

CHj07Jtsgl|5868405 

EST cluster (not in UniGene) with exon hit

EST singleton (not in UniGene) with exon hit

ESTs 

CRO5_p2gI|6671910 

ESTs 

ESTs 

CH.12Jisgi|5B66941 

ESTs 

ESTs 

EST cluster (not in UniGene) with exon hit 
ESTs 
ESTs 
ESTs 



Hs.164166 
Hs.129563 
Hs.128245

Hs.163313 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.77



1.8
1.8
1.8
1.8

1.8
1.8
1.8

1.8
1.79 
1.79 
1.79 
1.79 
1.79 
1.79 
1.79 
1.79 
1.79 
1.79 
1.78 
1.78 
1.78 
1.78 
1.78 
1.78 
1.78 
1.78 
1.78 
1.78 
1.78 
1.77 
1.77 
1.77 
1.77 



322934 AI483054 Hs.158968
325902 

W01813 
AI274851 
AI025527 
AA437300 
K92449 
T52760 
AA627356 
T26528 

AA029Q58 
AI752482 
AA419617 
AW451142 
AW449374 
AA504429 
AI149880 

AW297444 
AA995014 
AI318426 
H15355 



CH22LF6ENES.701JB
Hs.208484 ESTs
Hs.119824 ESTs
Hs.33722 ESTs

CH.16j)2fli|6165199 
CR07Jisgl|5868375 
ESTs 

CR16J* gl]5887101 
Hs.12109 WD40 protein Ciao1
Hs.258744 ESTs
Hs.222097 ESTs
Hs.178210 ESTs
Hs.116406 ESTs

EST cluster (not in UniGene) with exon hit
ESTs 



1.77 
1.77 
1.77 
1.77 
1.77 
1.77 
1.77 
1.76 
1.76 
1.76
1.76
1.76
1.76 
1.76 
1.76 



Hs.163315

Hs.227175 ESTs; Weakly similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.76



316893 AA837332 



CH.01Jisgl|5867442 
Hs.135145 ESTs 

EST cluster (not in UniGene)
EST cluster (not in UniGene)
Hs.25562S ESTs
Hs.257149 ESTs

Hs.6211 methyl-CpG binding domain protein 1
Hs.188809 ESTs

CH22_FGENES.780-5
EST singleton (not in UniGene) with exon hit
Hs.129544 ESTs; Weakly similar to ORF YULQ27w [S.cerevisiae]
Hs.155925 ESTs
Hs.60887 ESTs

CH.19Jisgi]5867423 
CH22LFGENES301-4 
Hs.153981 ESTs 
Hs.101433 ESTs 

CR21JisgItS456782 
EST cluster (not in UniGene)
324826 AA704808 Hs.143842 ESTs 1.75

311269 AI656924 Hs.174257 ESTs 1.75 

309375 AW075342 EST singleton (not in UniGene) with exon hit 1.75

314171 AI821895 Hs.193481 ESTs 1.75

311684 AI990741 Hs.252809 ESTs 1.75

334387 CH22_FGENES.380J 1.75

312195 AI300101 Hs.252222 ESTs 1.75

315707 AI416055 Hs.161160 ESTs 1.74

324349 AW501470 EST cluster (not in UniGene) 1.74

300724 AI762929 Hs.206134 ESTs; Weakly similar to similar to reverse transcriptase [C.elegans] 1.74

309906 AW339340 EST singleton (not in UniGene) with exon hit 1.74

303714 AW501333 EST cluster (not in UniGene) with exon hit 1.74

318704 Z24981 EST cluster (not in UniGene) 1.74

303027 AF111178 EST cluster (not in UniGene) with exon hit 1.74

322601 W92924 EST cluster (not in UniGene) 1.74

319382 H93199 Hs.33665 ESTs 1.74

315858 AA737345 EST cluster (not in UniGene) 1.74

332243 N5S484 Hs.220540 ESTs; Highly similar to ARYL HYDROCARBON RECEPTOR NUCLEAR TRANSLOCATOR [H.sapiens] 1.74

330951 H02566 Hs.191268 Homo sapiens mRNA; cDNA DKFZp434N174 (from clone DKFZp434N174) 1.74

324044 AL045752 Hs.211519 ESTs 1.73

320630 AA199847 EST cluster (not in UniGene) 1.73

327288 Oi01 M hsgq5867481 1.73 

314986 AI201367 Hs.142860 ESTs 1.73 

319078 H17255 Hs.144515 ESTs 1.73 

326278 CR17_hsgi!58B7269 1.73 

302552 H49792 EST cluster (not in UniGene) with exon hit 1.73

322322 AFQ86431 EST cluster (not in UniGene) 1.73

327075 CH^1_hsgqS531865 1.73

317392 AI797588 Hs.145459 ESTs 1.73

300810 AI076890 Hs.186949 ESTs 1.73

315978 AA830893 Hs.119769 ESTs 1.73

323903 AA773580 Hs.193598 ESTs 1.73

330803 AA0046S9 Hs.150580 putative translation initiation factor 1.73

309845 AW296802 Hs.255580 EST 1.73

314963 AI689617 Hs.200934 ESTs 1.73

311710 F09774 Hs.175971 ESTs 1.73

315315 AI984592 Hs.15088 ESTs 1.73

300378 AA663560 Hs.235873 ESTs; Weakly similar to K1 1C45 [C.elegans] 1.73

316141 AW303457 EST cluster (not in UniGene) 1.72 

319826 T71739 Hs.75442 albumin 1.72 

312961 AI033922 Hs.122517 ESTs 1.72 

334379 CH22._FGENES.379 J 1 1.72 

305854 AA862733 EST singleton (not in UniGene) with exon hit 1.72 

313031 N348Z7 Hs.186568 ESTs 1.72 

329728 CH.14_p2gi]6065785 1.72 

312090 N57692 Hs.118064 ESTs 1.72

323341 AL134875 Hs.162388 ESTs 1.72

302077 AA310580 Hs.132898 Homo sapiens chromosome 11; BAC CIT-HSP-311e8 (BC269730) containing the hFEN1 gene 1.71

310766 AI971438 Hs.158824 ESTs 1.71

311450 AI809985 Hs.203340 ESTs 1.71

311792 AW238064 Hs.253909 ESTs 1.71

321500 H71999 EST cluster (not in UniGene) 1.71

311948 T78791 Hs.241569 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.71

302270 R56151 EST cluster (not in UniGene) with exon hit 1.71

329089 CHXJiS gi|5868614 1.71

322331 AF086467 EST cluster (not in UniGene) 1.71

318235 AI080361 Hs.134217 ESTs 1.71

304561 AA489792 EST singleton (not in UniGene) with exon hit 1.71

312681 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.71

310250 AI478629 Hs.158465 ESTs 1.71

338178 CH22_EM^C005500.GENSCAN^1^ ' 1.71

338910 CH2a.DJ32liaGENSCAN.11-2 1.71

321225 AL080073 Hs.251414 Homo sapiens mRNA; cDNA DKFZp564B1462 (from clone DKFZp564B1462) 1.7

322289 AA534550 Hs339 ribosomal protein S29 1.7

319802 AT701469 Hs.202501 ESTs 1.7

314022 AW452420 Hs.248678 ESTs 1.7

314937 AA515602 Hs.152330 ESTs 1.7
300580 AA761322 Hs.220538 ESTs 1.7

304388 AA262785 EST singleton (not in UniGene) with exon hit 1.7

313421 AW339515 Hs.163700 ESTs 1.7

309763 AW270182 EST singleton (not in UniGene) with exon hit 1.7

322092 AF085833 EST cluster (not in UniGene) 1.7

315603 AA764758 Hs.121158 ESTs 1.7

325031 T08597 EST cluster (not in UniGene) 1.7

327157 CRQ1_hsgi|5866841 1.7

314809 AI741461 Hs.161904 ESTs 1.7

320361 H67220 HS.14S406 nMasel 1.69

324721 AW4023G2 Hs.43616 ESTs 1.69

328624 CH.07 hsgi|5868246 1.69

303344 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 1.69

328960 CR08Jisgi)6456775 1.69

315702 AA657501 Hs.146315 ESTs 1.69

3G2385 AJ224172 Hs204G96 lipophilin B (uteroglobin family member); prostatein-like 1.68

319699 R14537 EST cluster (not in UniGene) 1.68

309506 AW137700 EST singleton (not in UniGene) with exon hit 1.68

330417 D84424 Hs.57697 hyaluronan synthase 1 1.68

315296 AA876905 Hs.125286 ESTs 1.68

328538 Cfi07_hsg!|5868485 1.68

323923 AA354146 EST cluster (not in UniGene) 1.68

32Q303 AL0792B9 Hs.137154 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35971 1.68

302967 AI927068 Hs.110853 ESTs; Weakly similar to R1 0012,12 [C.elegans] 1.68

310695 AI472124 Hs.157757 ESTs 1.68

307512 AI273B15 Hs.242463 keratin 8 1.68

338506 CH22_EMAC005500.GENSCAN.39r>10 1.68

331722 AA195405 Hs.110347 Homo sapiens mRNA for alpha integrin binding protein 80; partial 1.68

301431 R05385 EST cluster (not in UniGene) with exon hit 1.68

318853 Z42977 Hs.21062 ESTs 1.68

323032 AW244073 Hs.145946 ESTs 1.68 

317538 AW137772 Hs.185980 ESTs 1.68 

325780 CH.14_hsgt|6381853 1.67

321739 AL08Q280 EST cluster (not in UniGene) 1.67

319808 T58960 EST cluster (not in UniGene) 1.67

313443 AA249037 EST cluster (not in UniGene) 1.67

331366 AA424754 Hs.43149 ESTs 1.67

316443 AI797592 Hs.207407 ESTs 1.67

322878 AA081820 EST cluster (not in UniGene) 1.67

330320 Cfi08j)2gi|5932415 1.67

329081 CRX_hS 995868602 1.67

334026 CH22_FGENES318.3 1.67

317791 AI801500 Hs.128457 ESTs 1.67

322235 ARJ86106 EST cluster (not in UniGene) 1.66

331148 R73816 Hs.17385 ESTs 1.66

325452 CH.12_hsgi|5866941 1.66

315106 AW452184 Hs.232100 ESTs 1.66

326014 CH.16Lhsgi|5867160 1.66

307130 AI185234 EST singleton (not in UniGene) with exon hit 1.66

300943 AA524545 Hs.224630 ESTs 1.66

319402 W21298 EST cluster (not in UniGene) 1.66

310889 AI457946 Hs.170437 ESTs; Weakly similar to hyperpolarization-activated; cyclic nucleotide-gated channel 2 [H.sapiens] 1.66

323371 AL135118 EST cluster (not in UniGene) 1.66

335568 CH22_FGENES.581_4 1.66

320654 AW263086 Hs.118112 ESTs 1.66

338983 CH22_DA59H18.GENSCAN^-1 1.65 

330002 CH.16_p2 $6623963 1.65 

315343 AW205477 Hs.179891 ESTs 1.65 

334487 CH22_FGENES^95_9 1.65

312169 AI064824 Hs.183385 ESTs 1.65

309668 AW204480 Hs.253414 EST 1.65

309518 AW148928 Hs.248895 EST 1.65

307965 AI421641 EST singleton (not in UniGene) with exon hit 1.65

316787 AW369770 Hs.130351 ESTs 1.65

300835 AA401858 Hs.224843 ESTs 1.65

338763 CK2^BAAC005500.GENSCAN517-16 1.65

303327 AA232729 Hs.154302 ESTs 1.65

313231 AW138993 Hs.163682 ESTs 1.65
334073 CH2£JGENES.327_28 1.65

318901 T77138 Hs*7B5 RNA helicase-related protein 1.65

326530 CH.19JisgI|5857441 1.65

301128 AI8Q2877 Hs.210843 ESTs; Weakly similar to (U1039K52 [Helens] 1.65

314043 AA827082 EST cluster (not in UniGene) 1.65

304387 AA236027 EST singleton (not in UniGene) with exon hit 1.65

322932 AA099732 EST cluster (not in UniGene) 1.65

337272 CH22_FGENES.660-1 1.64

332694 AA262768 Hs.243901 KIAA1057 protein 1.64

318996 Z44266 EST cluster (not in UniGene) 1.64

315336 AW342028 Hs.256112 ESTs 1.64

313329 AW293704 Hs.122658 ESTs 1.64

318088 AW295409 Hs.137945 ESTs 1.64

313835 AI538438 Hs.159087 ESTs 1.64

320035 AA378974 Hs.130720 ESTs; Weakly similar to CELLULAR NUCLEIC ACID BINDING PROTEIN [H.sapiens] 1.64

309372 AW074330 EST singleton (not in UniGene) with exon hit 1.63

324157 AW4Q2236 EST cluster (not in UniGene) 1.63

323929 AA354940 Hs.145958 ESTs 1.63

302490 AA8855Q2 Hs.187032 ESTs 1.63

333942 CH22_FGENES.301_8 1.63 

327469 Ca02jisgil5867772 1.63 

301918 AA476777 EST cluster (not in UniGene) with exon hit 1.63

315664 AI744068 Hs.160712 ESTs 1.63

304405 AA282572 EST singleton (not in UniGene) with exon hit 1.63

310624 AB41594 Hs.157522 ESTs; Moderately similar to env protein [H.sapiens] 1.63

319250 F11623 EST cluster (not in UniGene) 1.63

310608 AI962234 Hs.196102 ESTs 1.63

317348 AI348076 Hs.831 34iydroxymethyt-3^mytgtutar^^ 1.63

306513 AA989230 EST singleton (not in UniGene) with exon hit 1.63

320807 AA08S110 Hs.188536 Homo sapiens clone 24838 mRNA sequence 1.63

303710 AI269069 Hs.250852 ESTs; Highly similar to ubiquitin hydrolyzing enzyme I [H.sapiens] 1.63

328291 CH.07„hs gi[5868363 1.63

304236 W93278 EST singleton (not in UniGene) with exon hit 1.63

317683 AI791700 Hs.127893 ESTs 1.63

311960 AW440133 Hs.189690 ESTs 1.62

312834 AI028309 Hs.114246 ESTs 1.62

325328 CH.11_hS0i]5666ff75 1.62

313663 AB53261 Hs.169813 ESTs 1.62

327528 CHXXUtsgl]6381682 1.62

300429 AW449679 Hs.156739 ESTs; Highly similar to XG GLYCOPROTEIN PRECURSOR [H.sapiens] 1.62

305169 AA663131 EST singleton (not in UniGene) with exon hit 1.62

316621 AI021996 Hs.122138 ESTs 1.62

329868 CH.14j32gij6272129 1.62

316035 AI744130 Hs.131201 ESTs 1.62

300492 AL031709 multiple UniGene matches 1.62

316532 AI307229 Hs.184304 ESTs 1.62

332048 AA496019 Hs.201591 ESTs 1.62

307113 AI183686 EST singleton (not in UniGene) with exon hit 1.62

319127 N49478 EST cluster (not in UniGene) 1.62

331155 R87650 Hs*3439 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.61

338220 CH22 EM4C005500.GENSCAN.246-9 1.61

315763 AW515270 Hs.118342 ESTs 1.61

323571 AA984133 Hs.153250 c-Cbl-interacting protein 1.61

312240 R28628 Hs£03669 ESTs 1.61

304569 AA490934 EST singleton (not in UniGene) with exon hit 1.61

313179 AI076101 Hs.131704 ESTs 1.61

326858 CH^0jtsgf|8552462 1.61

317276 AI823847 Hs.129988 ESTs 1.61

312572 AA350125 Hs.187499 ESTs 1.61

311832 AW451654 Hs.257482 ESTs 1.61

302103 AA452310 Hs.26090 ESTs; Weakly similar to T20B12.1 [C.elegans] 1.61

308413 AI636253 Hs.196511 EST 1.61

310077 AI620617 Hs.148565 ESTs 1.61

337780 CH22_EM^X)00097.GENSCAri121-2 1.61

327798 Ca05jisgl|5867882 1.61

308352 AJ610791 EST singleton (not in UniGene) with exon hit 1.61

324539 AI378032 Hs.125B92 ESTs 1.61

303232 AA437414 EST cluster (not in UniGene) with exon hit 1.61

337884 CH22_EMJO05500.GENSCAN.54-2 1.61
303520 AA397546 Hs.119151 ESTs 1.61

303481 AA336839 EST cluster (not in UniGene) with exon hit 1.61

314461 AA548589 Hs.105846 ESTs 1.61

300327 Hs.245B93 cola 1.6

323473 AA262442 EST cluster (not in UniGene) 1.6

326154 (HJ 17 he 0(15067170 1.6

331920 AA446885 Hs.99087 ESTs; Moderately similar to ZINC FINGER PROTEIN 141 [H.sapiens] 1.6

323827 AW406878 EST cluster (not in UniGene) 1.6

322452 W56710 EST cluster (not in UniGene) 1.6

310597 AI73S071 Hs.158515 ESTs 1.6

307871 A13GSGBS EST singleton (not in UniGene) with exon hit 1.6

322215 AP088005 EST cluster (not in UniGene) 1.6

316420 AI139357 Hs.143837 ESTs 1.6

332217 HS8987 Hs.102383 EST 1.6

324937 M79230 Flo* CGTa WWlO 1.6

320543 AF052176 Hs.158529 Homo sapiens clone 24457 mRNA sequence 1.6

300674 AW467388 EST cluster (not in UniGene) with exon hit 1.6

315193 AI241331 Hs.131765 ESTs 1.6


310713 
019/ to 


R9420d 




EST duster (not In UniGene) 


1X 


VM910 
OUI£ IU 


AI37QQR9 




ESTs 


1X 


309365 


AW0728S1 




EST singleton (not In UniGene) wOh exon hit 


1.6 


•W 1/1/13 
0&I4U3 




He 9A7MA 


adenylate kinase 3 


IX 


321908 


AA376936 


H&2G998 


ESTs 


IX 


AWO 


MrWOfcOOl 




EST duster (not In UniGene) with exon hit 


1X 




ML 10003/ 




ESTs 


1.6 


QinRQQ 
OIUOoo 






EST duster (not In UniGene) 


1X 


tjOOlaO 






CH22_FGENES^8_15 


IX 


O3AA0O 






CH22J=GENES^25_12 


1.6 


319097 


A1352038 


Hs.167160 

It* 19/ IQ9 


ESTs 


1X 


011440 


AW9AA9V 
AVnsU4£0Y 


Lie 1Q07TW 
no. 1 ocr UO 


ESTs; Weakly similar to !1B ALU SUBFAMILY J WARNING ENTRY IW (Rsaplens] 1.59 


317736 


A1361722 


no. i ott iv 


ESTs 


1X9 


000147 


AUQDQ01 




EST singleton (not In UniGene) with exon hit 


1X9 


313489 


AA017492 


Wr 135S55 
no. iw909v 


ESTs 


1X9 


01696Q 




Up 19O0CQ 
no. \Cf.a*J& 


ESTs 


1X9 


OCvoOO 






CH21JhSflil5867657 


1X9 


O 14/01 




U« 0ft9Q79 


ESTs 


1X9 


328397 






CR07jiSfli|5868397 


1X9 


OOlOfv 


AA4610R4 


no. lO/Dr f 


ESTs 


1X9 




MQ1410 


He 19090 


ESTs 


1X9 


3IU0U9 


A I9Q91 01 


H« 160006 
no. iovwO 


ESTs 


1X9 


016091 


AI1A7»v£.q 

Ml 


Hq 114179 
no. I l*flr& 


ESTs 


1X9 


329040 


A&26949 


H&1443&3 


ESTs 


1X9 


301161 


AA731S13 




EST cluster (not In UniGene) with exon hit 


1X9 


oUUMO 


AfflOAQQft 
AlUcOOOO 


Hc114£0Q 


ESTs 


1X9 


319142 


R37368 




EST duster (not In UniGene) 


1X9 


313528 


AW152263 


H&249243 


ESTs 


1X9 








EST singleton (not in UniGene) wfth exon hit 


1X8 


330123 






CH.19_p2g[|6671869 


1X8 


397810 






CR05JtsgI|5687968 


1X8 


318950 


AJ470014 

tVfl OO I** 


Hs_1 34603 
no. lOHOuo 


ESTs 


1X8 


vW/W 


A! 0340 94 


Hs.1 59476 
risk tiKJt/o 


tubuBn; alpha; ubiquitous 


1X8 




AA9909TC 


Hc9AAQ3A 
n&&4O0OO 


ESTs 


1X8 


o i row 


AI690960 


H« 901*345 


ESTs 


1X8 


320725 


A A7/W31Q 
MM/ UOO lb 


He 10AQR7 


ESTs 


1X8 


311339 AW292247 ESTs 1.58

334893 CH22_FGENES.452^7 1.58

316730 AA398215 EST cluster (not in UniGene) 1.58

315889 AWZ71639 HS521744 ESTs 1.58

303702 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDa subunit of cleavage and polyadenylation specificity factor [H.sapiens] 1.57

315088 AI492660 Hs.170935 ESTs 1.57

332514 AA156499 HsX454 protein kinase; cAMP-dependent; regulatory; type II; alpha 1.57

335549 CH2JLFGENES.576J0 1.57

329532 CH.10 _p2 gl|3983505 1.57

323140 AA180467 EST cluster (not in UniGene) 1.57

313168 AI801098 Hs.151500 ESTs 1.57

337896 O^EJyt^X)05500.GENSCAN5&3 1.57

330656 AA319514 Hs.211093 ESTs 1.57

324585 AI823969 Hs.132678 ESTs 1.57
317151


AW298185 


308818 


AI819700 


326547 




318833 


H06234 


320488 


R31386 


306629 


AI124514 


338083 
316868 


AI660898 


310937 


A1472880 


328638 
310074 


AI651039 


327058 




320076 


AI653733 


322345 


AF086529 


314731 


AI745498 


318687 


K49619 


303841 


AS934464 


302370 


AJ009849 


322571 


AF156271 


318050 


AI052093 


3Q3388 


AL039804 


323758 


AA833858 


328369 
329415 




303915 


AW468839 


338794 




303074 


AA243481 


318807 


RJ8434 


334287 




311828 


AW0247B8 


304592 


AA505833 


300785 


AA682913 


304921 


AA603092 


324605 


AW5Q2851 


324473 


AW501163 


300566 


K86709 


314165 


AA761265 


3Q2868 


M157392 


314034 


AI299137 


325389 
331849 


AM17078 


320536 


AA331732 


303347 


AA258033 


315769 


AA744875 


317031 


AA973297 


300203 


AB27065 


304037 


T26438 


322613 


AW160507 


317887 


AW138174 


322313 


AF086386 


323992 


AW411383 


325303 
312701 


AM57663 


304787 


AA582678 


305849 


AA861571 


314557 


AA401S67 


316507 


AI381515 


315023 


AA533505 


314920 


AA513406 


323097 


Z44354 


325043 


W27919 


307892 


AI376088 


324573 


AA491600 


313092 


A1923673 


324696 


AA641092 


303019 


AF098363 


317158 


AW59140 


3Q9536 


AW151933 


301568 


AI146423 



Hs.255735 ESTs
Hs.208231 EST

Cai9Jisgl|5867307
Hs.24888 ESTs

EST cluster (not in UniGene)

EST singleton (not in UniGene) with exon hit

CH22_EM^CO055D0.GENSCAN.174-1 
Hs.195602 ESTs 
Hs.170480 ESTs 

CHL07_hsgi]5004473 
Hs.148559 ESTs 

Cti21_hsgi[6531865 
Hs.204079 ESTs

EST cluster (not in UniGene)
Hs.204579 ESTs
Hs.127301 ESTs

EST cluster (not in UniGene) with exon hit
Hs.199297 Homo sapiens GNAS1 gene encoding NESP55

EST cluster (not in UniGene)
Hs.133132 ESTs

EST cluster (not in UniGene) with exon hit

EST cluster (not in UniGene)

Cri07.hsgij5868386 

CH.Y_hsgI|5868874 
Hs.257767 EST

CH25L^VC005500.GENSCAN528-1
Hs.127320 ESTs; Weakly similar to KIAA0346 [H.sapiens]

EST cluster (not in UniGene)

CH2£_FGENES569_17
Hs.233374 ESTs
Hs.162017 EST

Hs.247179 ESTs; Weakly similar to KIAA0319 [H.sapiens]

EST singleton (not in UniGene) with exon hit
Hs.249978 ESTs

EST cluster (not in UniGene)
Hs.21371 son of sevenless (Drosophila) homolog 1
Hs.221281 ESTs

EST cluster (not in UniGene) with exon hit
Hs.154214 ESTs

CH.12.hsgI]5866921 
Hs.193767 ESTs 
Hs.137224 ESTs 

EST cluster (not in UniGene) with exon hit
Hs.189413 ESTs
Hs.126101 ESTs
Hs.224877 ESTs

EST singleton (not in UniGene) with exon hit

EST cluster (not in UniGene)
Hs.130651 ESTs

EST cluster (not in UniGene)
Hs.169668 ESTs

CR11_hspj)586690a
Hs.128127 ESTs

EST singleton (not in UniGene) with exon hit

EST singleton (not in UniGene) with exon hit
Hs.128647 ESTs 
Hs.158381 ESTs 
Hs.185844 ESTs 
Hs.152307 ESTs 

Hs.180950 guanine nucleotide binding protein (G protein); q polypeptide 
Hs.32944 inositol polyphosphate-4-phosphatase; type I; 107kD
Hs.158759 EST
Hs.161942 ESTs
Hs.212827 ESTs
Hs.257339 ESTs

EST cluster (not in UniGene) with exon hit
Hs.129109 ESTs

EST singleton (not in UniGene) with exon hit
Hs.146709 ESTs
315674 AA651923 Hs.191850 ESTs 1.53

321861 N79341 EST cluster (not in UniGene) 1.53

310890 AI184510 Hs.143728 ESTs 1.53

330036 CR17j>2 gl]BO42048 1.53

316907 AA843868 Hs.190567 ESTs 1.53

312299 AA972712 Hs.174818 ESTs 1.53

331128 R51381 Hs.23423 ESTs 1.53

305177 AA663591 EST singleton (not in UniGene) with exon hit 1.53

337685 CH22_EM^C000097£ENSCAN.77-1 1.53

335290 CH22_FGBIES527_3 1.53

308895 AI858667 EST singleton (not in UniGene) with exon hit 1.53

307944 AI418246 EST singleton (not in UniGene) with exon hit 1.53

300867 AW340374 Hs.121033 neural precursor cell expressed; developmentally down-regulated 1 1.53

335320 CH23_FGENES534_7 1.53

329841 CR14_p2giI6672062 1.53

317916 AB65071 Hs.159983 ESTs 1.53

332901 CK22 FGENES56J2 1.53

305413 AA724659 EST singleton (not in UniGene) with exon hit 1.53

316707 AI016387 Hs.184406 ESTs 1.53

313693 AW469180 Hs.170651 ESTs 1.53

316101 AA922236 Hs.221037 ESTs 1.53

320796 AF038968 Hs.184543 secretory carrier membrane protein 1 1.53

307451 AK48615 EST singleton (not in UniGene) with exon hit 1.53

323648 AI679968 Hs.152060 ESTs 1.53

331482 N27515 Hs.40296 ESTs 1.53

318059 AIQ23175 Hs.167022 ESTs 1.53

325958 CR16_hsgij5867142 1.53

315736 AA664265 Hs.230213 ESTs 1.53

314740 AW015687 Hs.119427 ESTs 1.52

314117 AA224368 Hs.185164 ESTs 1.52

301646 AA313954 EST cluster (not in UniGene) with exon hit 1.52

338752 CH23_EM:AC005500.GENSCAN.513-10 1.52

309314 AW009312 EST singleton (not in UniGene) with exon hit 1.52

301445 AI208364 Hs.128233 ESTs; Weakly similar to REGULATOR OF CHROMOSOME CONDENSATION [H.sapiens] 1.52

308501 AI685263 Hs.201150 EST 1.52

312330 AA635305 Hs.121574 ESTs 1.52

318040 AI018150 Hs.148781 ESTs 1.52

336205 CH22_FGENES.719J0 1.52

325701 CH.14.hsgi|5867028 1.52

315009 AW189460 Hs.208358 ESTs 1.52
303121 AW407585 Hs.27769 ESTs; Weakly similar to mCAC [M.musculus] 1.52
309271 AI986221 EST singleton (not in UniGene) with exon hit 1.52
326385 CH.07 hsgip868395 1.52
307700 AI318545 EST singleton (not in UniGene) with exon hit 1.52
314591 AW103292 Hs.245328 ESTs 1.52
304484 AA432067 Hs.258373 ESTs 1.52
304382 AA232873 EST singleton (not in UniGene) with exon hit 1.52
304232 W52674 EST singleton (not in UniGene) with exon hit 1.52
309853 AW298169 Hs.57553 tousled-like kinase 2 1.52
312604 AW207346 Hs.143202 ESTs 1.52
313134 N63406 Hs.258697 ESTs 1.52
330391 AF015950 Hs.115256 telomerase reverse transcriptase 1.52
314342 AI873046 Hs.258775 ESTs 1.51
305977 AA887293 EST singleton (not in UniGene) with exon hit 1.51
301165 N85789 Hs.224155 ESTs; Weakly similar to PTERIN-4-ALPHA-CARBINOLAMINE DEHYDRATASE [H.sapiens] 1.51
300613 AI932294 Hs.249604 ESTs; Weakly similar to B-CELL LYMPHOMA 6 PROTEIN [H.sapiens] 1.51
324124 AJ554212 Hs.185664 ESTs; Weakly similar to SERINE/THREONINE-PROTEIN KINASE NRK2 [H.sapiens] 1.51
308037 AI458207 Hs.174181 ESTs 1.51
323909 AL043148 Hs.186257 ESTs 1.51
315464 AW139500 Hs.116135 ESTs 1.51
306700 AI022056 EST singleton (not in UniGene) with exon hit 1.51
337976 CH22_EM:AC005500.GENSCAN.107-1 1.51
306355 AI083982 EST singleton (not in UniGene) with exon hit 1.51
311045 AI569399 Hs.174746 ESTs 1.51
315010 AA531082 Hs.240049 ESTs 1.51
310205 AW025248 Hs.202445 ESTs 1.51
310759 AW135924 Hs.224883 ESTs 1.51
310954 AW449044 Hs.171298 ESTs 1.51
312019 T77046 Hs.188750 ESTs 1.51
334773 CH22_FGENES.430_5 1.51
332043 AA490831 Hs.128056 ESTs 1.51
322950 AA296219 EST cluster (not in UniGene) 1.51
337920 CH22_EM:AC005500.GENSCAN£7^ 1.51
328993 CH.09_hs gi|58S8538 1.51
309245 AI972447 EST singleton (not in UniGene) with exon hit 1.51
312172 AI222168 Hs.191168 ESTs 1.51

304039 T47349 EST singleton (not in UniGene) with exon hit 1.5
301329 AI149653 Hs.190496 ESTs 1.5
313373 AI949248 Hs£0Q381 ESTs 1.5
324248 AW504918 EST cluster (not in UniGene) 1.5
308771 AI809301 EST singleton (not in UniGene) with exon hit 1.5
334935 CH22_FGENES.464_3 1.5
319764 AA018827 EST cluster (not in UniGene) 1.5
318519 T27135 EST cluster (not in UniGene) 1.5
332807 CH22_FGENES.7^ 1.5
322310 AF086376 EST cluster (not in UniGene) 1.5
324557 AA489166 Hs.156933 ESTs 1.5
332118 AA809585 Hs.162689 EST 1.5
319539 R09027 EST cluster (not in UniGene) 1.5
313149 AW291092 Hs.201058 ESTs 1.5
329722 CH.14_p2 gi|6065785 1.5
323514 AA861209 EST cluster (not in UniGene) 1.5
308078 AI472621 EST singleton (not in UniGene) with exon hit 1.5
337965 CH22_EM:AC005500.GENSCAN.100-10 1.5
335905 CH22_FGENES.635_13 1.5

TABLE 14A shows the accession numbers for those primekeys lacking UnigeneIDs for
Table 14. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.
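
As an illustration of how the three columns of Table 14A relate to one another, a minimal Python sketch follows. It assumes that each row is whitespace-delimited as Pkey, CAT number, then one or more accession numbers; the class name, the function name and the example row are hypothetical and are not taken from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterRow:
    pkey: int              # unique Eos probeset identifier number
    cat_number: str        # gene cluster number, kept as text (e.g. "12345.1")
    accessions: List[str]  # Genbank accession numbers comprising the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split one 'Pkey CAT-number Accession...' row into its three columns."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and an accession: {line!r}")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


# Hypothetical row, for illustration only (not an entry of Table 14A).
example = parse_cluster_row("999999 12345.1 AA000001 AA000002")
print(example.pkey, example.cat_number, len(example.accessions))
```

Keeping the CAT number as text rather than as a number avoids losing the suffix that distinguishes clusters.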



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

321632
313833
322310
322313



Pkey CAT number Accession 



322064 2345HJ 
321409 197838J 

322092 46678.1 
321452 212379.2 
313603 199797J 
.1 



321500 
313733 
322215 



322331 

322347 
322370
321739 
321781 
314570 
300129 
322452 
321881 
323140 

321914 
322571 
322574 
314753 
300370 



46806.1 

552826.1 

441212.1 

47003.1 

47070.1 

286374J 

120893J 

47378.1 

47386.1 

47434.1 

47467.1 

47S37.1 

47545.1 

187612.1 

43998.1 

1511778J 

280469J 

635249.1 

497108.2 

1651920J 

159551J 



85114J 

22297.1 

S941SJ 

311451J 

3910.2 



322601 577912.1 
322613 34330 J 



316055 409389J 
323316 981458J 
300492 25768.1 



BE261397 Z78343 BE176419 AA383657 N90640 AA334052 AW955761 BE536232 AA374087 AA584776

N71838 AA282003 T54072 AA761419 H92986 AI831371 AI095435 AI690247 R99331 AW964110 AA975590 AA346128

H94196 C03864

AF085833 R69689 AW341677 AA923375 BE327566 AW630415 R69601 AW615339
AW962489 H64300 AA329527
AA284333 AW468119 AA284334 AA810992

AB040928 T84673 AI289313 AI536039 Z44366 BE141499 D60116 D61468 D59945 AA419503 R28090 R72988 K03255
AI189112 AI912312 AW511018 AI401349 AW470144 C14624 AI335707 Z40300 AI014456 D60269 D60115 T16722 AI370673
D50270

H53744 AF075088 H53797
BE004271 AI248023 AI022157 H71999
AA766346 AA809877 AA836116 AW469598 AW977404
AF088005 N51816 N51731

AF086106 AI183589 AW665594 N71795 AA722627 AW6S5373 AI300251

AW812795 AA419617 H87827 AW299775 AW382168 AW382133 BE171659 AW392392 BE171641 AA541393

AA766825 AA811180 AA085906 AI762946 AW977820

AF086376 W77804 W72689 AA837735

AF086386 W77947 W72708

AF086431 AA886756 AI557237

AF086467 W81444 W81445

W95298 AF086529 AI912190 AW294159 AI458747 W94782

AF086538 W95969 AI631911 W95835

AA330095 W25112 AA249401

AL080280 773124 KQ2689 AL080281

D78667 D78871 C18258

AA504776 AA405696 AA405962

AW028820 AB19068

AI147202 W56755 W56710

N79341 N99082 N47551

AA180467 AA449184 AA464831 AA505048

T55958 T57205 AF147346

AA011603 N58604 N58611

NM_016102 AF156271 AA781868 AW152318 AW770403 AA909463 AA482998 AA758672
AF156548 AA639797 AI675267 AJ825497 AI823355
318635 163534J
319699 747196J 
319713 1689356.1 
319761 75324J2 
319764 88596.1 
319808 7069.3 
321040 193331.1 
320409 43709 J 



319881 
320488 
321121 
321205 
321253 
314043 
320630 
313435 
313443 
313472 
321348 
314138 
320712 
321383 






308521 
308561 
308617 
308771



.1 
.1 

1545647J 

81249J 

375160.1 

155125J 

17685J2 

4435Z7.1 

82292.1 

82811.1 

41762-1 

179950.1 

57156.2 

41924J 



312996 187327.1 

306513 

306537 



306700 
308078 
306813 



329722 c14_p2

329728 c14_p2 

306690 

308100 

308147 

306929 



303019 41850.1 

303084 44211.1 

305092 AA642912 

305169 

305177 

305235 

305413 



AA742999 Z43272 AA345258 AW956677 AA031942

W19657 BE616760 BE259848 BE382680 BE615587 AT934464 AA322745 T07155 AW961174 AA307302 Z41888 AA621992
AA188400 AW770608 AI147458 AI148408 AI696291 AA972591
T19204 T38109 T36107

R09027 AA344892 AA329574 AW955648 AW978708 AI567804 AI378935 AW014657 AI804134 R08922 N92947 BE546788

F08365 Z43395 R54298

T89949 AA654769 AA664550 AW975264

Z44286 H06384 AV655948

R17531 AW960899 AA338366 AW673294 BHM7729 BE047722 AA330746 AW841797 K05030 AI142105 R12654

AI458682 H24240 R14537 R18426 AW867082

R24204 R15712 T84695

AW630974 BE005208 R84237 AA724997 AA334867 AW955777 R18816

AA019827 R18947 H46852

T58960 AA609180 AA621130 AI927236 AA431075

AA261830 AW967855 H26953 AA262478

AA226869 AA296516 AW959753 AA186390 AL359819 AA356195 M148427 R22748 AI033624 BE548853 H95327

AW579751 BE561649 AA397533 BE617136 AA236444 T89946 AA247450 N55777 W38725 AI743846 AI808406 AA922229

AI051464 W04713 R11251 W19656 AI042319 AA489276 AI224533 H95274 AW269958 T89311 AI890088 AI862754

AI830968 AI669336 AI589780 AA534557 AW273839 AI338155 AI126632 N83542 BE046048 AA807028 AA848107

AW187978 AA976930 AA148428 AI289304 AI524262 AI625961 AA773469 AI222288 AI280054 AI242371 AA227222

AA973329 AA296517 AA829436 AA234528 AI149769 AI557865 AA936939 AJ590681 AW469308 AI689531 AA486419

AI422051 AKJ57252 AA626941 AW75352 AW247913 AI222370 AA670122 AW198034 AA486418 AI363794 AA380739

H51289 H44619 H46391 R86024 H51892 T72744

AI817338 R32883 AA595590 AI743065 R31386

W23285 H42714 F25381 F37215

AA002047 N72537 K54142 H81580

AA610649 AI699484 H59558

AA827082 AA732246 AA167611 AA830741

AA199847 AA410224 R53323 AW936567 AW936569 AW936568 AW936571

AA769123 AA831715 AW977666 W92553

AA005125 W95019 W93335 AA249037

AA007374 AA007466 AI816886

Z49978 D61703 U30168

AA740816 AA654854 AA229923

R66667 R65678 R82673 W73128 R83101

AW968556 AJ238555 AW96S731 AJ002574 AA459446 H70260 AW977557 AA767351 AW268572 AA810719 AI698677

AI300460 AA907450 AA649224 T07415 AB36898 BE018515 AI279865 BE047421

AW368634 AI702169 AI245179 AW368646 BE545574 AA249018 AW368633 N27553

AA989230 

AA991705 

AA994530 

AI000320 

AI000929 

AI022056

AW72621

AI066544

AI075803 

AI083982 



AI092235 
AI475949
AI498991 
AI124514 
AI610791 
AI624497 



AI701559 
AI738720 
AI809301
AI824829 



AF098363 AF098365
AF174008 AF174027 AF174106

AA663131 



AA670480 
AA724659
305849


AA861571 


305854 


AA862733 


307113 


AI183688 


307130 


AI185234


305937 


AA883238 


305977 


AA887293


307451 


AI248615 


307513 


AI274307 


307848 


AI364186 


307871 


AI368665 


307881 


AI370434 


307932 


AJ230822 


307944 


AI418246 


307954 


AI419692 


307965 


AI421641 


309245 


AI972447 


309271 


AI986221 


309365 


AW072861 


309372 


AW074330 


309435 


AW090537 


309506 


AW137700 


309536 


AW151933 


309709 


AW242630 


325417 c12_hs




325450 c12_hs




325452 c12_hs




309815 


AW292760 


309839 


AW296076 


309849 


AW297444 


309906 


AW339340 


302705 31765J 


U09060 U09061


304037 


T26438 


304039 


T47349 


304236 


W93278 


304257 


AA053294 


304382 


AA232873 


304405 


AA282572 


304561 


AA489792 


304569 


AA430934 


304787 


AA582678 


304921 


AA603092 


327819 QJ5J» 




304968 


AA614308 


306382 


AA968967 


331263 47479J 


AW780192 AA015718 W02571 


332252 1663967J 


N63882 T91174
TABLE 14B shows the genomic positioning for those primekeys lacking UnigeneIDs and
accession numbers in Table 14. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.



Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
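
The nucleotide position entries can be turned into exon spans with a few lines of Python. This is a sketch under two assumptions that the table itself does not state: that the coordinates are inclusive, and that a range may be written high-to-low (as in the Minus-strand entries referenced to Dunham et al.). The function name is illustrative only.

```python
from typing import Tuple


def exon_span(nt_position: str) -> Tuple[int, int, int]:
    """Return (low, high, length) for a range such as '297686-297808'.

    Assumes inclusive coordinates. The order of the two numbers is ignored,
    so a range written high-to-low yields the same span.
    """
    first, second = (int(part) for part in nt_position.split("-"))
    low, high = min(first, second), max(first, second)
    return low, high, high - low + 1


print(exon_span("297686-297808"))    # Plus-strand entry listed for Pkey 332807
print(exon_span("1390386-1390296"))  # Minus-strand entry listed for Pkey 332864
```

Entries whose separator or digits were lost in reproduction cannot be parsed this way and would need to be checked against the original table.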



Pkey Ref Strand Nt_position


332807 Dunham, I. et al. Plus 297686-297808
332808 Dunham, I. et al. Plus 298277-298360
332812 Dunham, I. et al. Plus 309688-310561
332901 Dunham, I. et al. Plus 1841954-1842090
333149 Dunham, I. et al. Plus 3574317-3574413
333916 Dunham, I. et al. Plus 8298994-8299169
334026 Dunham, I. et al. Plus 9196549-9196681
334061 Dunham, I. et al. Plus 9686941-8687077
334073 Dunham, I. et al. Plus 9792201-9792374
334150 Dunham, I. et al. Plus 10529221-10529854
334379 Dunham, I. et al. Plus 13908356-13908467
334719 Dunham, I. et al. Plus 15778859-15779026
334773 Dunham, I. et al. Plus 16235169-16235328
334893 Dunham, I. et al. Plus 19302753-19302891
334935 Dunham, I. et al. Plus 20108247-20108373
335148 Dunham, I. et al. Plus 21491292-21491457
335320 Dunham, I. et al. Plus 22542132-22542246
335568 Dunham, I. et al. Plus 24935021-24935655
335586 Dunham, I. et al. Plus 24990333-24990497
335601 Dunham, I. et al. Plus 25044923-25045157


ddUWO 


ntifiham 1 «♦ al 

uuniMun, i. QuBL 




OQA1 0700.90(11 QA77 


336123 Dunham, I. et al. Plus 30051089-30051186
336268 Dunham, I. et al. Plus 31997555-31998040
337173 Dunham, I. et al. Plus 23624127-23624224
337460 Dunham, I. et al. Plus 32536159-32536395
337685 Dunham, I. et al. Plus 3547161-3547245
337736 Dunham, I. et al. Plus 3850500-3850643
337780 Dunham, I. et al. Plus 41137934113390
337965 Dunham, I. et al. Plus 7034267-7034392
337976 Dunham, I. et al. Plus 7166011-7166119
338030 Dunham, I. et al. Plus 8072708-8072827
338112 Dunham, I. et al. Plus 10391398-10391600
338165 Dunham, I. et al. Plus 12205719-12205875
338178 Dunham, I. et al. Plus 12800037-12800161
338427 Dunham, I. et al. Plus 19685043-19685354
338506 Dunham, I. et al. Plus 21221871-21221953
338794 Dunham, I. et al. Plus 27114697-27114763
338910 Dunham, I. et al. Plus 28795375-28795551
339047 Dunham, I. et al. Plus 30760783-30760968
332864 Dunham, I. et al. Minus 1390386-1390296
332933 Dunham, I. et al. Minus 2035780-2035681
333193 Dunham, I. et al. Minus 3832893-3832494
333712 Dunham, I. et al. Minus 7286177-7286073
333940 Dunham, I. et al. Minus 8523830-8523671
333942 Dunham, I. et al. Minus 8552629-8552330
334287 Dunham, I. et al. Minus 13294116-13293871
334387 Dunham, I. et al. Minus 13946021-13945781
334487 Dunham, I. et al. Minus 14432191-14432132
334913 Dunham, I. et al. Minus 19463909-19463815
335109 Dunham, I. et al. Minus 21325792-21325687


335250 Dunham, I. et al. Minus 21952922-21952828
335288 Dunham, I. et al. Minus 22304275-22303770
335290 Dunham, I. et al. Minus 22309950-22309891
335549 Dunham, I. et al. Minus 24666203-24666128
335862 Dunham, I. et al. Minus 26690300-26690125
335864 Dunham, I. et al. Minus 26694537-26694382
335905 Dunham, I. et al. Minus 26988888-26988719
338205 Dunham, I. et al. Minus 30477458-30477311
338276 Dunham, I. et al. Minus 32093320-32093181
336433 Dunham, I. et al. Minus 34067540-34067425
336605 Dunham, I. et al. Minus 15616509-15616358
336616 Dunham, I. et al. Minus 26021027-26020848
336679 Dunham, I. et al. Minus 2035790-2035681
337043 Dunham, I. et al. Minus 17407330-17407251
337272 Dunham, I. et al. Minus 28241476-28241307
337357 Dunham, I. et al. Minus 30906179-30906109
337393 Dunham, I. et al. Minus 31471747-31471569
337497 Dunham, I. et al. Minus 33371317-33371258
337646 Dunham, I. et al. Minus 2648689-2648632
337920 Dunham, I. et al. Minus 605164*6051510
338083 Dunham, I. et al. Minus 9318438-9318301
338220 Dunham, I. et al. Minus 14166440-14166104
338752 Dunham, I. et al. Minus 26421374-26421135
338763 Dunham, I. et al. Minus 26628148-26626009
338983 Dunham, I. et al. Minus 29908865-29908702
339209 Dunham, I. et al. Minus 32492953-32492593
325240 5866848 Minus 32301-32650
329532 3983505 Plus 42937-43014
329522 3983507 Minus 35265-35458
329519 3983510 Plus 18407-18597
329511 3983514 Plus 20965-21325
325326 5866675 Plus 47726-48024
325303 5866908 Minus 73556-73630
325389 5866921 Plus 239672-239759
325417 5866925 Minus 110635-110745
325450 5866941 Minus 435379-435552
325452 5866941 Minus 704103-704202
325498 5866967 Plus 173372-173930
325587 6682462 Plus 126724-126967
325602 5866994 Plus 79122-79251
325701 5867028 Minus 72936-73046
325780 6381953 Plus 63634-63873
329722 6065785 Minus 112713-112992
329728 6065765 Minus 207544-207741
329666 6272129 Plus 98307-98446
329815 6624888 Minus 68431-68720
329841 6572062 Minus 40181-40331
325824 5867048 Minus 42450-42833
325868 5887076 Minus 94358-94628
325902 5867101 Minus 127729-127842
325958 5867142 Plus 53437-53550
326014 5867160 Minus 10358-10447
329941 6165199 Minus 34319-34411
330002 6623963 Plus 46097-46158
326154 5867170 Minus 7103-7179
326023 6867245 Plus 171799-171896
326278 5867269 Plus 75250-75903
330036 6042048 Plus 117120-117216
326547 5867307 Minus 623877-623870
326495 5867423 Plus 11843-11930
326507 5867435 Minus 13038-13111
326505 5867435 Minus 8818-8949
326506 5867435 Minus 9368-9509
326530 5867441 Minus 303000-303122
326508 6682496 Plus 78904-79112
330120 6671864 Minus 127553-127656
330123 6671869 Minus 35311-35406
326858 6552462 Minus 69337-69870
326983 5867657 Minus 16023-16581
327014 5867664 Plus 1017630-1017788
326930 6456782 Plus 606950-607705
326920 6456782 Minus 42425-42519
327058 6531965 Plus 2384268-2384835
327061 6531865 Minus 3486389-3486673
327075 6531965 Plus 4041318-4041431
327120 6531970 Minus 6-1088
330126 6093735 Plus 82458-62623
327157 5866841 Minus 4408-4746
327183 5867442 Plus 84317-64531
327192 5867445 Minus 194652-194764
327288 5867481 Plus
327469 5867772 Plus 145549-145708
327489 6004459 Minus 57796-58015
327526 6381682 Minus 97010-97123
327574 5887818 Plus 68767-69126
327665 5867839 Plus 141736-141900
327752 5887949 Plus 93721-94421
327819 5867968 Minus 92202-92717
327796 5867882 Plus 85267-85405
330260 6671884 Plus 45203-45269
330282 6671910 Plus 39824t14
328076 5868008 Plus 72807-72665
328121 5868031 Plus 153782-153850
328190 5868077 Plus 21082-21165
328227 5868105 Minus 21082-21242
327871 5868131 Minus 88889-89221
328018 5902482 Minus 542547-543133
328624 5868246 Minus 120666-120836
328744 5868290 Plus 138639-138722
328799 5868316 Minus 80771-80923
328291 5868363 Minus 144244-144434
328329 5868375 Plus 191709-192239
328369 5668388 Plus 75371-75583
gg$385 5868395 Plus 369952-370155
328397 5888397 Plus 344967-345063
328412 5868405 Plus 86427-86519
328538 5668485 Plus 3814-4243
328656 6004473 Plus 792616-792729
328638 6004473 Plus 294618-294903
328903 5668514 Plus 23625-24468
328960 6456775 Plus 38547-38837
330320 5932415 Minus 54458-54697
328993 5888536 Plus 49160-60084
329081 5868602 Plus 93368-93510
329089 5868614 Plus 25805-26923
329109 5868526 Plus 102168-102273
329192 5868716 Plus 166936-167020
329218 5868726 Minus 71408-71707
329224 5868728 Plus 27422-27664
329246 5868732 Minus 250541-250792
329415 5868874 Plus 1011438-1011818
329454 5868887 Plus 51342-51593






TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16 

Table 15 depicts UnigeneID, UnigeneTitle, Primekey, Predicted Cellular Localization, and
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is
linked by EosCode to Table 16.
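
Because the two tables share the EosCode column, a lookup keyed on EosCode is enough to join them. The sketch below assumes that Table 16 can be read as a mapping from EosCode to a sequence record; the field names and all values shown are placeholders, not data from either table.

```python
from typing import Dict, List, Optional

# Placeholder rows; real values would come from Table 15.
table15: List[Dict[str, str]] = [
    {"pkey": "000001", "eos_code": "XXX1", "localization": "not determined"},
    {"pkey": "000002", "eos_code": "XXX2", "localization": "secreted"},
]

# Placeholder sequence records keyed by EosCode; real values would come from Table 16.
table16: Dict[str, str] = {"XXX1": "<sequence record for XXX1>"}


def sequence_for(row: Dict[str, str]) -> Optional[str]:
    """Look up the Table 16 record that shares this row's EosCode, if any."""
    return table16.get(row["eos_code"])


for row in table15:
    print(row["pkey"], row["eos_code"], sequence_for(row))
```

Returning `None` for a missing EosCode, rather than raising, makes rows without a Table 16 counterpart easy to list.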



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

EosCode: Internal Eos name

Localization: Predicted cellular localization of gene product



Pkey ExAccn UnigeneID Unigene Title EosCode Localization



100394 
100452 
101249 
101485 
101514 
101851 
102398 
102522 
102689 
103119 
103709 
104080 
104144 
104691 
105370 
106149 
106579 
107102 
107217 
108153 
109014 
109112 






127537 
128790 
129109 
129184



D84276 
D87742 
L33881 
M24738 
M28214 
M94250 
U42359 
U53347 
U71207 



110151 
112971 
113021 
114908 
114965 
116393 
116416 



117984 
118985 
119018 
119126 
120992 
121710 
121913 
122041 
122593 
123209 
124526 



AA037316 

AA402871 

AA447439 

AA011176 

AA236476 

AA424881 

AA456135 

AA609723 

D51095 

AA054237 

AA156790 

AA169379 

H04649 

H18836 

T17185 

T2385& 

AA236545 

AA250737 

AA599463 

AA609219 

N41002 

N51919 

N94303 

N95796 

R45176



Hs.241552
Hs.1904

Hs.123072
Hs.82045

Hs.183556

Hs.29279

Hs£877

Hs.13804

Hs.57771

Hs.183390

Hs.37744

Hs.22791

Hs.256301

Hs.23023

Hs.30652

Hs.40808

Hs.262038

Hs.257924

Hs.20843

Hs.31608



Hs.128836

Hs.54973

Hs.72472



126645 



AA419011 

AA428062 

AA431407 

AA453310 

AA489711 

N62096 

AA126075 

AI167942

R38438 

AA569531 

AA291725 

AA491295 

W26769 

AA621604 



Hs.45107 

Hs.106778 

Hs.55028

Hs.278685

Hs.117183

Hs.57594



HsJ8732
Hs.128749
Hs.203270
Hs.293165

Hs.1635

Hs.182575 

Hs.162859 

Hs.105700 

Hs.108708 

Hs.109201 



CD38 antigen (p45) PBC1 plasma membrane

KIAA0268 protein PAB7 not determined

protein kinase C, iota OAA1 cytoplasmic

selectin E (endothelial adhesion molecul ACC5 plasma membrane
RAB3B, member RAS oncogene family PFJ2 cytoplasmic
midkine (neurite growth-promoting factor LBH9 secreted
gb:Human N33 protein form 1 (N33) gene, PDG3
solute carrier family 1 (neutral amino a PFJ4 plasma membrane
eyes absent (Drosophila) homolog 2 LEM9 cytoplasmic
cadherin 3, type 1, P-cadherin (placenta LBG2 plasma membrane
hypothetical protein dJ4620212 P006
kallikrein 11 PBA6 secreted

hypothetical protein FLJ13530 P0M3
Homo sapiens beta-1 adrenergic receptor PAV1 plasma membrane
transmembrane protein with EGF-like and PDM9 plasma membrane
hypothetical protein MGC13170 PD08
ESTs PAA4 plasma membrane

KIAA1344 protein PAA3 not determined

DKFZP586E1621 protein PDG8
ESTs PBF1 plasma membrane

ESTs, Weakly similar to Z223_HUMAN ZINC POG7
hypothetical protein FLJ13782 BCU4 not determined

Homo sapiens cDNA FLJ11245 fis, clone PL PDG4
hypothetical protein FLJ20041 PAV9 plasma membrane

transmembrane, prostate androgen induced CHA1 not determined

KIAA1028 protein PD03
cadherin-like protein VR20 PFJ8 plasma membrane

ESTs BCY2 mitochondrial

hypothetical protein MGC2648 PDV3
ESTs OAB8
ESTs PDT9
ATPase, Ca++ transporting, type 2C, memb
ESTs, Weakly similar to I54374 gene NF2 PDM8
Homo sapiens prostein mRNA, complete cds
ESTs PBF8
KIAA1210 protein PDG5
prostate androgen-regulated transcript 1 PDV5
ESTs; protease inhibitor 15 (PI15) BCU7
Homo sapiens Chromosome 16 BAC clone CIT
alpha-methylacyl-CoA racemase PDOI
ESTs, Weakly similar to ALU1_HUMAN ALU S
ESTs, Weakly similar to JC7328 amino ad PAV4
transmembrane, prostate androgen induced PDY4
six transmembrane epithelial antigen of PAA5 plasma membrane
solute carrier family 15 (H+/peptide tra PD05 plasma membrane
ESTs PAAB not determined

secreted frizzled-related protein 4 BCX2 secreted
calcium/calmodulin-dependent protein kin PFJ7
CGI-86 protein PAV6 vesicular

spondin 2, extracellular matrix protein CJA5 not determined



129404 
129534 
130750 
131425 
132984 
132957 
133179 
133330 
133520 
133724 
133724 
133944 
134110 
301805 
302005 
302881 



AA172056 

R73640 Hs.11260
AA128997 Hs.18953
AA219134 Hs.26691
AA031360 



AA032221 

U81539 

U42360 

X74331 

U07919 

U07919



Hs.61635
Hs.38731
Hs.71119
Hs.74519
Hs.75746
Hs.75746 



303753 
308050 



310431 
310573 
310598 
310816 
311598 
313676 
314121 
314691 
314785 
314907 
315051 
315052 
316442 
317548 
317869 
318524 
319191 
319763 
320324
320561
320796
321441

324430



322782 
322818 



324617 
324626 



324718 
330211 
330546 
330762 
330790 



AA045870 Hs.7780 
U41060 Hs.79136 
Hs.142846
Hs.123119
AA508353 Hs.105314
AA340605 Hs.105887
030891 Hs.19525
AW503733 Hs.9414
AM60004 Hs.31608
AI734009 Hs.127699
M420227 Hs.149358
AW292180 Hs.156142
AI338013 Hs.140546
AI973051 Hs.224865
AI682088 Hs.79375
AA861697 Hs.120591
AI732100 Hs.187619
AW207206 Hs.136319
AI538226 Hs.32976
AI572225 Hs.222886
AW292425

AA876910 Hs.134427
AA760894 Hs.153023
AI654187 Hs.195704
AW295184 Hs.129142
AW291511 Hs.159066
AF071538
AA460775 Hs£295
AF071202 Hs.139338
NM_006953 Hs.159330
AF038966 Hs.31218
AW297633 Hs.118498
W07459 Hs.157601
AA056060 Hs.202577
AW043782 Hs.293818
AF055019 Hs.21906
AA639902 Hs.104215
AI146683 Hs.143691
AA464018 Hs.184598
AW016378 Hs.292934
AA508552 Hs.195839
AI685464 

AI684767 Hs.129179 
AI557019 Hs.116467 



U31382 Hs.299867

AA449677 Hs.15251

T48536 Hs.122784

330892 AA149579 Hs.91202

331099 R36671 Hs.14848

331490 N32912 Hs.291039

331889 AA431407 Hs.98802
332247 N58172
332396 AA340504
332697 T94885 
332798 
334447 



ESTs PAB4 
hypothetical protein FLJ11264 PAJ3
phosphodiesterase 9A PEES
ESTs PBA7
ESTs PAA7
six transmembrane epithelial antigen of PM17
homeo box B13 PFJ5
Putative prostate cancer tumor suppresso PDM1
primase, polypeptide 2A (58kD) PDM2
aldehyde dehydrogenase 1 family, member
aldehyde dehydrogenase 1 family, member
Homo sapiens mRNA; cDNA DKFZp564A072 (fir
UV-1 protein, estrogen regulated BCR4
hypothetical protein PEU4
MAD (mothers against decapentaplegic, Dr PBJ6
relaxin 1 (H1) PBH3
ESTs, Weakly similar to Homolog of rat Z PE64
hypothetical protein FLE2794 PBM4
KIAA1488 protein PBY3
hypothetical protein FLJ20041 PEU5
KIAA1603 protein PCQ8
ESTs, Weakly similar to A46010 X-linked PBH1
ESTs PEN3
ESTs PCW3
ESTs PET5
holocarboxylase synthetase (biotin-[prop PBH8
ESTs PBY2
ESTs PBY1
ESTs BFF8
guanine nucleotide binding protein 4 CB07
ESTs, Weakly similar to TRHY_HUMAN TRICH
ESTs PBM9
ESTs PBJ7
ESTs PBJ9
ESTs PBQ6
deoxyribonuclease II beta PBQ7
hypothetical protein FLJ10188 PBJ1
prostate epithelium-specific Ets transcr PEN1
ESTs, Weakly similar to T17248 hypotheti PE07
ATP-binding cassette, subfamily C (CFTR PBH5
uroplakin 3 PELS
secretory carrier membrane protein 1 PBY4
Homo sapiens LUCA-15 protein mRNA, splic
ESTs

Homo sapiens cDNA FLJ12166 fis, clone MA
ESTs PCQ7
Homo sapiens clone 24670 mRNA sequence
ESTs, Moderately similar to SPCN_HUMAN S
ESTs PBQ9
Homo sapiens cDNA: FLJ23241 fis, clone C
ESTs PBM3
ESTs, Weakly similar to I38022 hypotheti PBH4
gb:tt88f04.x1 NCI_CGAP_Pr28 Homo sapiens
Homo sapiens cDNA FLJ13581 fis, clone PL
small nuclear protein PRAC CBK1

PBJ2

guanine nucleotide binding protein 4 PEW4
hypothetical protein PBM1
TMPRSS2, transmembrane protease, serine
ESTs PBQ4
Homo sapiens mRNA; cDNA DKFZp564O016 (fr
ESTs PCI4
ESTs, Moderately similar to T14342 NSD1 PBH7
gbaa21 ID9.S1 Soares fetal liver spleen PBQ5
gbJtw31a09Jc1 NCLCQAP_KkJ1l Homosapien
transgelin 2 PBQ8
PBH2
PBY9 
PBY7 



nuclear 

plasma membrane 



nuclear 

plasma membrane 

PDT1 mitochondrial 
POT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 
cytoplasmic



not determined 
not determined 
plasma membrane 

plasma membrane
plasma membrane 



not determined 
cytoplasmic 
PBM2 not determined

plasma membrane 



cytoplasmic 



plasma membrane 
plasma membrane 
not determined 
PBY8 not determined 



PBQ1 not determined 
plasma membrane 
PCC not determined 
PBJ5 

not determined 
PBY6 not determined 

cytoplasmic 
PCW6

PBJ4 plasma membrane
nuclear

not determined

cytoplasmic 

not determined 

PEL3 plasma membrane 

plasma membrane 

PCQ1 cytoplasmic 

nuclear 

not determined 



PBJ8 not determined 

secreted 

nuclear 

not determined 

not determined
401424
407122 
408430 



409361 
4110% 
413125 
413623 
414422 
415263 
417153 
418601 



418882 
419839 
421887 
422083 






451939 

451982 

452039 A022988
452340 
452784 



425071 
425710 
427958 
428819 
429900 
429918 
430226 
431217 
431716 
431992 
432189 
432244 



432966 
439176 
440260 
440901 
445424 
446320 
447210 
449156 
449625 



PFG2 

H20278 Hs.31742 ESTs PEW7
S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine PEZ3
AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence
AK000631 Hs.52256 hypothetical protein FLJ20624 PFG1
NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo PEW3
U80034 Hs.68583 mitochondrial intermediate peptidase PEZ9
BE244589 Hs.75207 glyoxalase I PFJ3
AA825721 Hs.246973 ESTs OBH6
AA147224 Hs£37232 Homeobox A13 PFC6
AA948033 Hs.130853 ESTs PEZ5
X57010 HsJ1343 "collagen, type II, alpha 1 (primary ost PFJ1
AA279490 Hs.86368 calmegin PFA1
AI820961 Hs.193465 ESTs PEY4
NM_004996 Hs.89433 ATP-binding cassette, sub-family C (CFTR OBH2
U24577 Hs.93304 "phospholipase A2, group VII (platelet- PFH9
AW161450 Hs.109201 CGI-86 protein PFH2
NM_001141 Hs.111256 "arachidonate 15-lipoxygenase, second ty PFH5
AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 PFA3
NM_013989 Hs.154424 "deiodinase, iodothyronine, type II" PFH6
AF030880 solute carrier family, member 4 PFD4

AA418000 Hs.98280 potassium intermediate/small conductance PFH1
AL135623 Hs.193914 KIAA0575 gene product PFD6
AA460421 Hs.30875 ESTs PE27
AW873986 Hs.119383 ESTs PEY5
BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface PEZ4
NM_013427 Hs.250830 Rho GTPase activating protein 6 PFG6
D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain PEZ1
NM_002742 Hs.2891 protein kinase C, mu PFH4
AA527941 gb:nh30c04.s1 NCI_CGAP_Pr3 Homo sapiens

AI669973 Hs.200574 ESTs PEWS
W07088 Hs.293685 ESTs PFG3
AA650114 Hs.325198 ESTs PEY3
AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr PEWS
AI972867 Hs.7130 copine IV PEW8
AA909358 Hs.128612 ESTs PFC8
AB028945 cortactin SH3 domain-binding protein PEZ6

AF126245 Hs.14791 "acyl-Coenzyme A dehydrogenase family, m
AF035269 phosphatidylserine-specific phospholipas PFH8

AF103907 Hs.171353 prostate cancer antigen 3, non-coding DO PEZ8
NM_014253 odz (odd Oz/ten-m, Drosophila) homolog 1 PEZ2

AF055575 Hs£3838 calcium channel, voltage-dependent, L ty PFD2
U80456 Hs.27311 single-minded (Drosophila) homolog 2 PFJ8
F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp56401763 (f
ESTs PFD8
NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma PFG4
BE463857 Hs.151258 hypothetical protein FLJ21062 PFC5
X95425 Hs.31092 EphA5 PFH3



mitochondrial 

plasma membrane 

PEY1 

nuclear 



cytoplasmic 



ER 



secreted 

plasma membrane 
cytoplasmic 



plasma membrane 



plasma membrane 
nuclear 

cytoplasmic 
PFA2 



PFH7 



plasma membrane 
plasma membrane 

PFG9 plasma membrane

nuclear 
cytoplasmic 
plasma membrane
TABLE 15A shows the accession numbers for those primekeys lacking a UnigeneID in Table
15. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number

Accession: Genbank accession numbers 



Pkey CAT number Accession 



116393 131543J AJ972402 AI634409 AI523716 AI799749 W44518 AI424438 AI688513 AI971048 AI686324 AW013854 AA588483 AA528111 AI627428
AI582200 AK69286 AI826926 AI620526 AI569958 AI972458 AI924500 AA512903 W44517 AA335363 AW238997 BE300165
BE250665 AA284195 AA523420 W52834 AI471970 AI952824 AW003820 AW009463 AA669798 AA114966 AI653342 AA115038
AI342150 AI092100 AI968211 W51094 AI804005 AI201420 AI123210 AT738405 AI574964 AI970341 AW027500 AI493316 AB33193
AJ139353 AA599463 AI656163 AI804200 AI365321 AI980213 AI557011 AA650025 AI968810 AI341978 AA599839 AW592602
AA644289 AI468578 AI565265 AI565228 BE221535 AW973052
101485 18113J AA296520 AL021940 M30640 NM_000450 M24736 M61894 AL047443 H39560 AI594691 AA916787 AI214796 AA939085 AI150616

AA412553 AA412545 AI051015 T?7654 AA694430
126399 17331.1 AA088767 AF224278 AA128075 AL035541 AA027926 AI761441 AI972096 AW071693 AI742327 AI377498 AI804615 AJ640802

AI885001 AI921394 AA595115 N71820 AI921217 AW007283 AI467828 AJ369306 AAA 17446 AI493598 AA088701 AA126899 AI936228
AW204238 AI039567 AI925027 BE138909 AW452945 AW135998 AA310984 AA027860 AW073519 AI537597 AA953976 AI521341
AW273569 AW050740 AA536113 AA559064 AI474392 AW135709 AA535181 AW572959 AA570597 AI905464 AI877810 AI587642
AW975102 AA424310 AA482527 N64192 AA658278 AW889117 AA486591 AW889172 AJ381990 AI381991 AI673419 AI990950
AA487031 AI272934 AI150565 AA229168 AW316722 AI142707 BE222396 AA614168 AA122026 AW338227 AA632457 AI968726
AW369662 AA512956 AA541675 AA451748 AJ250993 BE146418 AA122025
94346.1 AB62575 AI805082 AW263421 AI432462 AA135870 AA031360 AA031604 AA298475 AA298464

21074.1 NM_012445 AB027466 BE407510 BE047605 AA047125 AW084003 AA149494 AA149490 AA292528 AA570505 AA526186 AW006250
AW007762 AI341557 AI799668 AI972710 AI377966 AJ962810 AI084783 AI458032 AI190971 AW148913 AA372354 AW970032
AW007426 AA650188 AI123203 AI122890 AI280975 W73595 W73495 AI863238 AA374109 AA603988 AW149089 AW957523
AI307748 AI921067 AJ336463 F24537 AI380460 AI367500 AI189309 AI814701 AJ766921 AW572106 AA037024 AW072576 AA578293
AI288103 AA235464 AW450642 AA574230 AW294024 AI589229 AI580733 AW512227 AA877009 AI660255 AW188597 AA558228
AI572782 AA658397 AI274628 AI888359 AA854573 AI264439 AA621604 AW515493 AW243333 Z39737 AI567038 AA573997
AA573559 AW238431 AI652870 AI584973 AA034505 AA047126
156454 J AI267700 AI720344 AA191424 AI023543 AJ469833 AA172056 AW958465 AA172238 AW953397 AA355086
9836.1 AL080235 AA031750 D81382 AI480231 AI095947 AI560953 BB010721 AI870290 AA374945 AA125792 D51527 D51556 AI685541
D51559 AW117286 AA195741 AI675138 AW593439 AI201885 T30590 AW952100 D51095 AA523864 W70043 AA987586 AI421515
AI205532 AA127069 AI337387 D51595 AI453785 AW075677 AW088359 C14237 C14284
121710 19268.1 AF163474 NM_016590 AF163475 AI781 105 AJ770098 AA410580 AA41161 6 AI590343 AT739050 ALD50198 AI862645 AA419104 
AA513809 AA333032 AB16915 AW139625 AA64Q889 AI311391 A1627693 AW135514 AA41901 1 AJ269149 AI245259 AI970008 
AI970017 AW139445 AA569503 AI761072 AI7661 79 AI759995 AI300776 AI870129 AW150770 AA226501 AA226220 . 
121913 291015.1 AI249368 AI742316 AA428062 AA442089 A1864189 BE349478 A1803475 A1584049 BE552085 AI088609 AB64197 AK88144 A1129474 

A1307145 BE181 300 AW058403 AI696838 AW748598 AA4421 96 AI216428 
102398 entrez_U42359 U42359

315051 347217.1 AW292425 BE467167 AI7Q2853 BE550961 BE222309 AI299348 AI693336 AA541708 
324626 338411 J AI685464 AW971336 AA513587 AA525142 

319191 16065.1 NM.012391 AF071 538 AB031549 A1685592 AI745526 AA662204 AW1 30657 AA862164 AW971121 AI668916 AA513274 AI991223 
AI979170 AW298436 AA639821 AI85901 0 AW513942 AI687669 AA662521 AA548598 AI345056 A!305374 BE043418 AI432856 
AJ334840 AB79796 AW82693 AI307915 BE042082 AI307834 A1307858 AI309488 BE042210 AI435670 AI371605 A1862491 AI264563 
AI306872 AI255044 AI254601 AI251238 A1473073 A1473042 AI432760 A1435664 AI336826 AI289365 AI369096 AI8Q2274 A1334871 
AI349863 AI250405 AI377617 A1309895 AI313017 AI862291 AI311936 AB78718 AI305722 AJ306769 AI308888 AI334565 AI862296 
AI344230 A1435685 AI344087 AI37B696 AI311209 A1435775 AQ10611 AB11154 AI432289 AM31561 AI492681 AI432867 AI335288 
A1432796 A1432769 AI3102S9 AW32273 A1379820 AB75319 AM35753 AI609441 AI43Z7B7 AI3891 00 A131 1420 AI349974AI247157 
AB34677 A1270910 AE24320 A1305608 A1334489 AI377152 AK550012 AI370088 AI335053 AI306781 AI306750 AB34849 AI334874 
AI340380 A1307876 AI305974 AI305972 AI311521 AB34872 AD3625Q9 AI31 1498 AI335051 A1289684 AQ10859 AB1 1862 AI862483 
AI492775 AI307906 AI492708 AI289693 AI340373 AQ0791 0 AI31 1359 AM35653AI334865AI31 1492 AI492809A(492690AI4315re 
A1862268 AI31 1879 AB08435 AI492792 AI862512 AI275321 AI431568 AI431564 AI307885 AC07926 A1435692 AI435778 AI31 0182 
AB08894 A1492707 A1492713 AI308560 AI307829 AI343234 AI580598 AW4727B6 AB40918 AB1Q243 AI309368 A1307B20 A1289665 
107217
330211
332798
334447
332247 372969J
332336 2Q265L.1
332697 13899J
425710
432189
445424
447210 7119J
449825 8113J
452039 89513J
AI306777 AW08631 8 AW086282 AW0B5378 AB10027 AI275293 AI369082 A1340900 AI306749 A1371558 AW086287 BBM3803 
AI306793 AI306272 A1287948 AI270917 AI284816 AJ33681 3 A1284546 AI308O44 At275290 AE70872 A1305795 AI289687 AI223570 
AI305303 A12B9S77 AI287742 AE75284 AI306812 A1336701 AI371554 AI378719 AI344988 AE23631 A1335141 AI343222 AI284568 
AI305357 A1275270 AI345932 AM36549 A1307925 AJ311502 AI344238 AI3431 82 A1308503 A1305988 AI270790 AI379782 A1305647 
AI305410 AJ432251 AI436517 A1343227 AI305534 AI340387 A1271043 AI305499 A1271 046 A1305962 AEB9465 AI305378 AI289725 
AJ310848 AI305848 AI289362 AI252964 AI307049 A1310831 A1306983 A1306796 AI224659 A1305969 AJ348855 A1306164 A1306948 
AI284676 AJ309155 A1343202 AI432785 A1306815 AI369081 A1270885 AI289699 AI435704 A1309647 AI305716 AI31 1281 A1287927 
AI472995 A1340423 AJ270958 AI307069 AI305364 AJ27D307 A1275306 A131 1 890 AI275263 AW32750 A12B9371 AI432861 A1255113 
AI305709 AM73008 AI31116B A130971 1 AI377164 AI271201 A1289560 A1309710 A1306195 AI31 1201 A1287741 A1271 066 A1432876 
AI2752B1 AG797B5 A1472972 A131 1967 AI306826 AI305465 A1270782 AJ473019 AB05340 AI270922 AI305995 A1305462 A1254144 
A1270969 AI473012 AI305390 AJ275278 AI223644 AI289692 A1250318 A1305372 AI289G91 AI250521 AI306283 A806B14 AI307933 
AW73160 A1432903 AI223720 AI254979 AI3348S2 A1306926 AE89541 AI432248 A1435722 AI435698 A1432B59 AI310683 A1473175 
AI335144 A1289467 A1436489 A13Q5928 AJ473033 A1305763 A1307868 A1307882 A1348959 AI435736 AJ432857 AW32E96 A1435735 
AI432283 AW7308S A1432863 AM73C81 A1432825 AI307840 AM73164 A1432B85 A1473166 A1472982 AW35734 AM73060 AI473171 
A1432279 A1432882 AI334670 AI438512 A1432827 AI432852 AW73051 A1473077 A1435697 AI271509 AM92781 AI472933 AW73018 
AW32897 AI473043 AI432871 AI436538 AI473157 AI349715 AI432777 AI473016 A1473158 A1340369 AK307941 AI432773 AJ377146 
A1492791 AI270950 AI305342 AE84604 AI306269 AI28481 1 A127081 1 AI289347 AI334869 AI334852 AB11759 AI250382 AI309520 
AI289550 ABQ5721 AI340870 A12709D1 A1308575 AI307904 AI340715 AE70941 A1309808 AI246867 AI473014 A1307039 A1289360 
A1473069 AM92786 AI34401 3 A1305876 AM36510 AI340742 A1473028 AI307891 BE041871 BE041288 BE042340 BE041946 
BE041783 AI306173 AI201948 A1926972 AI275769 

CH22„6856FCL_UNieEM^C00 

C_5_p2 

CH22 J4FG.6 JJUNK.C4G1 X3 
CH22_1746FGL387J7JJNK^M 

AA669097 AA5 13815 AA026798 AA676526 AA704429 AA704269 AW1 18232 AA57S216 N58172 

AW579842 BE156S62 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798 R17370 A1908947 
AA382932 R58449 H18732 AA371231 AW9S2899 AA713530 AW892946 R534S3 H110B3 AWD68542 Z40761 BE178212 BE176155 
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 
BE463983 AI805213 AI761264 W94885 N945Q2 AI623772 A1419532 AI810302 AI834190 AW002516 A W1 50777 AI352312 A1367474 
AW204807 AI875502 AI337026 AW134715 BE328451 AJ 1231 57 AI560020 AI300745 A1608631 A1248873 AA742484 AW051635 
H18646 AI245045 AA507111 AI54051 0 AI925594 AA1 15747 AA143035 AA151108 

X51405 NM.001873T11322 AL118886 BE328175 AW136009 BE467445 AW470313 AA774852 BB04139 AW501046 AA082792 
AW389231 AA370044 R38841 AA371457 004813 R25791 R25556 AW895854 AW903819 AW895671 AW895677 BE159723 
AWB95S64 AW895597 AWB95595 AWB95665 AWB88516 AI903724 F06081 F08503 AL119482 AW895730AW888516 R26511 
R26489 AA334126 AA327626 N85713 AW895998 AA223Q22 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AAD01282 AA001 138 AA551566 AA330159 AI922855 AA383512 AAQ29603 D82246 082171 TB4933 K56545 AA343060 
AA1 7B888 R9B764 AW451817 AA385766 AA45281 8 AI690057 AA888822 BB49928 AA150901 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA369982 AW385948 AA922466 N75882 AI422070 A1361256 AI680224 D57122 "T94885 
R53266 R46713 T19071 AW796277 AA325333 F04719 F02334 AA356146 AA626597 AA358304 AW028099 AL1 19570 057290 
058273 057796 N48555 AI351969 AAS28457D57225 AWQ24046 AA992606 AWSQ22118 AWD21538 AA835645 HB9870 K56546 
AW951219 AA453239 AW837541 N45521 BE21 8029 AA318877 AA327740 AW961 809 T92139 D53216 D52365 053363 053312 
0531 16 AI547267 AA676935 AW026552 AW02641 8 AW1 90507 AI92771 0 AW2441 08 050948 AW054991 AW021 063 AW02251 1 
AA493436 AI365638 BE464751 AW149384 AA102442 AW771368 AT818251 AI126368 D51049 AI421542 AI559467 AW079779 
AW021048 AW023969 AW044214 AI458264 AA0Z7274 AI620254 AW028917 BE21951 1 AA326242 N67561 AI971273 AA87B328 
D57131 AA770662 A1309299 AI796767 AA613338 W58078 A1566287 A1445573 AI880260 AA001 91 9 AW339259 AM92610 AI49281 1 
R97692 AI301425 AA722603 D58361 A1350323 AA973926 AI431263 AA516126 AA865467 AI925177 N39443 AA001943 A1299371 
AKB2412 AA665090 AA583433 H89871 AA977231 AJ362219 AI055096 A1270446 N67524 N22103 AW614224 AA744054 AW243622 
AI613188 AI929173 A1350243 AJ3621 38 AA744004 AA176661 056787 A1955625 AI393109 AI094769 AW79728 AI423107 Al 95561 7 
AI034038 AI582196 AW264534 AI418961 AA570761 AJ343538 AA650341 AA992503 AA770004 AL039668 AI862675 AW1S0335 
AA610274 AW418627 BE467472 D56786 728749 AE17610 AI359556 T23523 AL040189 AA846222 AA651636 D512B0 AI888986 
AI521167 AI340177 AW612815 AI625285 AA621607 AA1 77059 AA229768 AA829788 A1749682 AW190631 N75299 AA230C89 
AI915632 BE069542 AA890020 AA528397 AA995390 BE503860 AA570812 AW339396 AM97986 AI203725 A1282379 AA670375 
AA461513 F01728 AW243599 C00856 N75567 R95995 AA150932A95961 AA648060 AA933800 AA927073 AA101126 AA864190 
T83566BE167472 

AF030880 NM.000441 AC002467 AA385554 H23053 AW891838 AI139968 AA653057 AI695233 
AA527941 A1810608 AI620190 AA535266 

AB028945 T77648 F13328 AL157605 Z46212 AA304736 F1 1855 T66098 T30174 AW954164 AW176301 AW748243 AA456428 
AI389958 AA938565 AW959613 Z42008 AA994779 A! 683309 F11019 F10928 AI769597 AJ752550 T65015 AI684314 AA643954 
Z41838 AW020147 AI038822 AW571822 AA299781 AA894828 AF131790 BE00541 1 AI902476 AW082695 AA464384 R42750 
AW902301 AA464273 R05837 Z38294 H41098 AL134507 M86079 

AF035269 AF035268 NMJ315900 T96213 U37591 AA156832 AA2S9371 AI084325 H95977 AI765967 BE221465 AA156726 AB69563 
AWQ24539 AI436791 AI949451 AA843093 A1452756 AA824232 A1306667 T96131 AW207447 AW243556 AW957032 AJ084332 
H95978 U 309 98 

NM_014253 AF100772 BE088769 AL022718 BE161779 AW863569 BE161640 AL039060 BE168542 AW298554 AA323193 AA235370 
AW779760 N48674 AI375997 R45432 059344 AEQ3107 F07491 R35380 R25094 AI913S31 A1498402T61382 AI016320 N45528 
T61415AA331486 

AI922988 K05475 AA021608 AW169947 AA913750 Z41614 AW800012 



342819J 
6391J
TABLE 15B shows the genomic positioning for those primekeys lacking unigene ID's and
accession numbers in Table 15. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.
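
The strand and nucleotide-position fields described above can be read into a small record, as in the following Python sketch. It assumes, as an interpretation not stated in the legend, that positions are inclusive coordinates and that minus-strand entries are written with the higher coordinate first, as in the rows of Table 15B. The names `Exon` and `parse_exon` are hypothetical.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    pkey: str
    ref: str
    strand: str
    start: int  # lower genomic coordinate
    end: int    # higher genomic coordinate

    @property
    def length(self) -> int:
        # Inclusive coordinates are an assumption of this sketch.
        return self.end - self.start + 1


def parse_exon(pkey: str, ref: str, strand: str, nt_position: str) -> Exon:
    """Build an Exon from one table row; nt_position looks like '59158-59215'."""
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    first, second = (int(part) for part in nt_position.split("-"))
    return Exon(pkey, ref, strand, min(first, second), max(first, second))


plus_exon = parse_exon("330211", "6013592", "Plus", "59158-59215")
minus_exon = parse_exon("332788", "Dunham, I. et al.", "Minus", "232147-231974")
print(plus_exon.length, minus_exon.length)
```

Normalising to a lower and a higher coordinate while keeping the strand as a separate field lets exons on either strand be compared or sorted with the same code.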

Pkey Ref Strand Nt_position

334447 Dunham, I. et al. Plus 14308764-14308824

332788 Dunham, I. et al. Minus 232147-231974

338258 Dunham, I. et al. Minus 15242294-15242231

330211 6013592 Plus 59158-59215

401424 8176894 Plus 24223-24428

TABLE 11 AND SEQUENCE LISTING

SEQ ID NO:1 BCU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_024815

Coding sequence: 13-1880 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60 
ATGCOCAGTO ACCCTCCATT CAATACCOGA AGAGCCTACA CCAGTGAGGA TGAAGOCTGG 120 
AAGTCATACT TGGAGAATCC CCTG ACAGCA GCCACCAAGG OCATG ATGAT CATTAATGGT 180 
GATGAGGACA GTGCTGCTGC OCTOGGOCTG CTCTATGACT ACTACAAGGT TCCTCGAGAC 240 
AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGGA GAAAAGAAAC 300 
TGCCTTGGCA CCAGTG AAGC CCAGAGTAAT TTGAGTGG AG GAG AAAAOOG AGTGGAAGTC 360 
CTAAAGACTG TTCCAGTGAA OCTTTOOCTA AATCAAG ATC ACCTGGAGAA TTOCAAGOGG 420 
GAACAGTACA GCATCAGCTT OOCGGAGAGC TCTGOCATCA TQCOGGTGTC GGGAATCAOG 480 
GTGGTGAAAG CTGAAGATTT CACACCAGTT TTCATGGCCC CACCTGTGCA CTATCCCOGG 340 
GGAGATGGGG AAG AGCAACG AGTGGTTATC TTTGAACAGA CTCAGTATGA CGTGCCCTCG 600 
CTGGCCACCC ACAGOGCCTA TCTCAAAG AC GACCAGCGCA GCACTGOGGA CAGCACATAC 660 
AGCGAGAGCT TCAAGG ACGC AGCCACAGAG AAATTTCGGA GTGCTTCAGT TGGGGCTGAG 720 
GAGTACATGT ATGATCAGAC ATCAAGTOGC ACATTTCAGT ACACCCTGGA AGCCAOCAAA 780 
TCTCTCCGTC AGAAGCAGGG GGAGGGOCCC ATGACCTACC TCAACAAAGG ACAGTTCTAT 840 
GCCATAACAC TCAGCGAGAC CGGAGACAAC AAATGCTTOC GACACCCCAT CAGCAAAGTC 900 
AGG AGTGTGG TGATGGTGGT CTTCAGTGAA GACAAAAACA G AGATGAACA GCTCAAATAC 960 
TGGAAATACT GGCACTCTCG GCAGCATACG GCGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020 
TACAAGGAGA GCTTTAATAC G ATTGGAAAC ATTGAAGAGA TTGCATATAA TGCTGTTTOC 1080 
TTTACCTGGG ACGTGAATOA AGAGGOGAAG ATTTTCATCA COGTGAATTC CTTGAGCACA 1140 
GATTTCTCCT OOCAAAAAGO GCTCAAAOOA CTIOCTXTO A TGATTCAGAT TOACACATAC 1200 
AGTTATAACA ATCGTAGCAA TAAAOCCATT CATAG AGCTT ATTGCCAGAT CAAGGTCTTC 1260 
TGTGACAAAG GAGCAGAAAG AAAAATCCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320 
GGG AAAGGCC AGGCCTCCCA AACTCAATGC AACAGCTCCT CTGATGGGAA GTTGGCTGCC 1380 
ATACCTTTAC AG AAGAAGAG TGACATCACC TACTTCAAAA CCATGCCTGA TCTCCACTCA 1440 
CAGOCAGTTC TCTTCATACC TG ATGTICAC TTTGCAAAOC TGCAGAGGAC CGGACAGGTG 1500 
TATTACAACA CGG ATGATG A AOGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 
OOCATGGAAG AGGAGTTTGG TCCGGTGCCT TCAAAGCAGA TCAAAGAAGA AGGGACAAAG 1620 
OGAGTGCTCT TGTACGTGAG GAAGGAGACT GAOGATGTGT TOGATGCATT GATGTTGAAG 1680 
TCTOCCACAG TGATGGGCCT GATGGAAGCG ATATCTGAGA AATATGGGCT GCOOGTGGAG 1740 
AAGATAGCAA AGCTTTACAA GAAAAGCAAA AAAGGCATCT TGGTGAACAT GGATG ACAAC 1800 
ATCATQG AGC ACTACTDGAA CO AGG ACAOC TTCATOCTCA ACATGG AG AG CATGGTGG AG 1860 
GGCTTCAAGG TCACGCTCAT GGAAATCXAS CXXTGGGTTT GGCATCCGCT TTGGCTGGAG 1920 
CTCTCAGTGC GTTCCTCCCT GAGAGAGACA GAAGCCCCAG CCCCAG AACC TGGAGACCCA 1980 
TCTCCCGCAT CTCACAACTG CTGTTACAAG AOCGTGCTGG GGAGTGGGGC AAGGGACAGG 2040 
CCCCACAGTC GGTGTGCTTG GCCCATCCAC TGGCAOCTAC CACGGAGCCG AAGCCTGAGC 2100 
COCTCAGG AA GGTGCCTTAO GOC1G T TGGA TTCCTATTTA TTGCCCACCT 1T1UJTUGAQ 2160 
OCCAGGTOCA GGCCOGCCAG GACTCTGCAG GTCACTGCTA GCTCCAG ATG AGAOCGTCCA 2220 
GCGTTCCCCC TTCAAG AGAA ACACTCATCC CGAACAGCCT AAAAAATItX CATCCCTTCT 2280 
TTCTCACCCC TCCATATCTA TATCTCCCGA GTGGCTGGAC AAAATGAGCT ACGTCTGGGT 2340 
GCAGTAGTTA TAGGTGGGGC AAGAGGTGGA TGCOCACTTT CTGGTCAGAC ACCTTTAGGT 2400 
TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTOCAGGGTT CCCAGCAAGT GGOCAOCAGG 2460 
CCTTGTACAG GAAGACATTC AGTCACCGTG TAATTAGTAA CACAGAAAGT CTGCCTGTCT 2520 
GCATTCTACA TAGTGTTTAT AATATTGTAA TAATATATTT TAOCTGTGGT ATGTGGGCAT 2580 
GTTTACTGOC ACIGGCCTAG AGGAGACACA GAOCTGGAGA CCGTITT AAT GGGGGTTTTT 2640 
GCCTCTGTGC CTGTTCAAGA GACTTGCAGG GCTAGGTAGA GGGOCTTTGG GATGTTAAGG 2700 
TG ACTGCAGC TGATGCCAAG ATGGACTCTG CAATGGGCAT AOCTGGGGGC TCGTTCCCTG 2760 
TCCCCAOAGG AAGCCCCCTC TCCTTCTCCA TGGGCATGAC TCTOCTTOGA GGCCACCAOQ 2820 
TTTATCTCAC AATGATGTGT TTTGOCTGAC TTTOCCTTTG CGCltjTCTOG TGGGAAAGGT 2880 
CATTCTGTCT GAGACCCCAG CTCCTTCTCC AGCTTTGGCT GCGGGCATGG CCTGAGCTTT 2940 
CTGGAG AGCC TCTGCAGGGG GTTTGCCATC AGGGCCCTGT GGCTGGGTCT GCTGCAGAGC 3000 
TOCITGGCTA TCAGGAG AAT CCTGGACACT GTACTGTGOC TCOCAGTTTA CAAAQACGGC 3060 
CTTCATCTCA AGTGGCOCTT TAAAAGGCCT GCTGCCATGT GAOAGCTGTG AACAGCTCAG 3120 
CTCTGAGTCG GCAGACTGGG GCTTCCTGCT GGGCCACCAG ATGG AAAGGG GGTATTGTTT 3180 
GOCTC ACTOC TGGATGCTGC GTTTTAAGGA AGTGAGTG AG AAAGAATGTG CCAAGATACC 3240 
TGGCTCCTGT GAAAOCAGCC TCAGGAGGGA AACTGGGAGA GAG AAGCTGT GGTCTCCTGC 3300 
TACATGCOCT GGGAGCTGGA AGAGAAAAAC ACTCCCCTAA ACAATCGCAA AATGATGAAC 3360 
CATCATGGGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCG1T CGOCCTTGTG 3420 
GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATCCTGC ACCCAAAGGT TCTCTACAAA 3480 
GGCCCAG ATT TGTTTGTAAA GCACTTTGAC TCTTACCTGG AGGCCCGCTC TCTAAGGGCT 3540 
TCCTGCGCTC CCAOCTCATC TGTCCCTGAG ATGCAGAGCA GGATGGAGGG TC1GC1TCTA 3600 
GCTCAGCTGT TTCTOCTTG A GGTTGGGG AG GAATTG AATT GAATGGGACA GAGGGCAGGT 3660 
GCTGTGGCCA AGAAGATCTC CGAGCAGCAG TG ACGGGGCA CCTTGCTGTG TGTCCTCTGG 3720 
GCATGTTAAC CTTTCTGTGG GGCCAAAGGT TTGCATCGTG GATCCAGCTG TGCTCCAGTC 3780 
TGTOOCCTOC TOCTCCACTC TG ACTGOCAC GOOOOGGAOC AGCAGCTTGG GGACOCTCCA 3840 
GGGTACTAAT GGGGCTCTGT TCTGAGATGG ACAAATTCAG TGTTGG AAAT ACATGTTGTA 3900 
CTATOCACTT COCATGCTCC TAGGGTTAGG AATAGTTTCA AACATGATTG GCAG ACATAA 3960 
CAACGGCAAA TACTCGOACT GGGGCATAGG ACTOCAGAGT AGGAAAAAGA CAAAAGATTT 4020 
GGCAGCCTGA CACAGGCAAC CTACOCCTCT CTCTOCAGOC TCTTTATGAA ACTGTTTGTT 4080 
TGOCAGTOCT GGOCTAAGGC AGAAGATGAA TTGAAGATGC TGTGCATGTT TCCTAAGTCC 4140 
TTGAGCAATC ATGGTGGTG A CAATTGOCAC AAGGGATATG AGGOCAGTGC CAOCAG AGGG 4200 
TGGTGOCAAO TGCCAGATOC CTTOCGATCC ATTCOCCTCT GTATCCTOGG AGCACOOCAG 4260 
TTTGCCTTTO ATQTGTOOGC TGTCTATGTT AGCTGAACTT TGATGAGCAA AATTTCCTGA 4320 
GCGAAACACT CCAAAGAQAT AGGAAAACTT GOOGCCTCTT CnTTTTOTC CCTTAATCAA 4380 
ACTCAAATAA GCTTAAAAAA AATCCATGGA AGATCATGGA CATGTGAAAT GAGCATTTTT 4440 
T T C 1 11 ICTT TTTnTTTTTTTTm I A AC AAAGTCTGAA CTGAACAGAA CAAG ACTTIT 4500 
TCCTCATACA TCTOCAAATT GTTTAAACTT ACTTTATGAG TG 1 1 TOTTTA GAAGTTCGGA 4560 
CCAACAG AAA AATGCAGTCA GATGTCATCT TGGAATTGGT TTCTAAAAGA GTAAGGCATG 4620 
TCCCTGCCCA GAAACTTAGG AAGCATGAAA TAAATCAAAT GTTTATnTC CTTCTTATTr 4680 
AAAATCATGC TAATGCAACA OAAAT AG AGO GTTTOTGC CA AATGCTATGA ACGGCOCTTT 4740 
CTTAAAG ACA AGCAAGGGAG ATTGATATAT GTACAATTTG CTCTCATGTT TTT 

Protein Accession #: NP_079191.1

1 11 21 31 41 51
I I I I I I

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMHNGDEDS 60 
AAALGLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNGUGT SEAQSNLSGG ENRVQVLKTV 120 
PVNLSLNQDH LENSKREQYS ISFPESS AH PVSGITWKA EDFTPVFMAP PVHYPRGDGE 180 
BQRWIFEQT QYDVPSLATH SAYLKDDQRS TPDSTYSESF KDAATEKFRS AS VGAEEYMY 240 
DQTSSGTFQY TLEATKSLRQ KQGEGFMTYL NKGQFYATTL SETGDNKCFR HPISKVRS W 300 
MWFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FNTIGNIEEI AYNAVSFTWD 360 
VNEEAKUTT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQKVFCDKG 420 
AERKXRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480 
FIPDVHFANL QRTGQVYYNT DDEREGGS VL VKRMFRPMEE EFGPVPSKQM KEEGTKRVLL 540 
YVRKBTDDVF D ALMLKSPTV MGLMEAISEK YGLPVEKIAK LYKKSKKGIL VNMDDNTJEH 600 
YSNEDIHLN MESMVEGFKV TLMEI 

SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1:

Nucleic Acid Accession #: AA428062

Coding sequence: 1-777 (entire sequence represents open reading frame)

1 11 21 31 41 51 

I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCOTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAACCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTO GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 
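
The listing format above (blocks of ten bases, a running position count at the end of each line, and a "Coding sequence: start-end" annotation) can be turned into a plain string and sliced, as the following Python sketch illustrates. The function names are hypothetical, and treating the coordinates as 1-based and inclusive is an assumption that agrees with the "1-777" annotation for a 777-base open reading frame. Ruler lines should be excluded before cleaning, since any letters on them would be kept.

```python
import re
from typing import Iterable


def clean_listing(lines: Iterable[str]) -> str:
    """Join sequence lines, dropping spaces and trailing position counters."""
    return "".join(re.sub(r"[^A-Za-z]", "", line) for line in lines).upper()


def coding_region(sequence: str, start: int, end: int) -> str:
    """Return the bases from start to end, counted 1-based and inclusive."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


first_line = "ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60"
bases = clean_listing([first_line])
print(len(bases), coding_region(bases, 1, 3))
```

Stripping every non-letter character handles both the block spacing and the position counters in one pass, at the cost of silently dropping characters that character recognition has turned into digits or punctuation.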

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2:

Nucleic Acid Accession #: AA428062

Coding sequence: 1-777 (entire sequence represents open reading frame)



1 11 21 31 41 51 
I I I I I I 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CXXOTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATG GAATA T ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GT TGG T CA AG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGT G TTTT CC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:5 BCU7 Protein sequence Variant 1:

Protein Accession #: none

1 11 21 31 41 51 

HXAXSAVSSA LLFSLLCEAS TWLLNSTDS SPPTNNPTDI EAALKAQLDS ADIPKARRKR 60 






YISQNDMIAI LDYHNQVRGK VFPPAANMKY MVWDENLAKS AEAKAATCITW BHGPSYLLRF 120 

LGQNLSVRTG RYRSILQLVK PWYDEVKDYA PPYPQDCNPR CPHRCFGFHC THYTQMVWAT 160 

SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWPK 

SEQ ID NO:6 BCU7 Protein sequence Variant 2:
Protein Accession #: none



1 11 21 31 41 51 

I I I I I I

MIAISAVSSA LLPSLLCEAS TWLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 

YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 

LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCPGPMC THYTQMVWAT 180 

SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWPK 
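
Two variant listings of equal length can be compared position by position, as in the following Python sketch. The function name `differences` and its `offset` parameter are hypothetical. The example uses the first two blocks of the line ending at position 240 of each BCU7 protein variant as printed above, so the first residue of the excerpt is position 181.

```python
from typing import List, Tuple


def differences(seq_a: str, seq_b: str, offset: int = 1) -> List[Tuple[int, str, str]]:
    """List (position, residue_a, residue_b) wherever two sequences differ.

    Whitespace between blocks is ignored; offset is the position number of
    the first residue supplied.
    """
    a = "".join(seq_a.split())
    b = "".join(seq_b.split())
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(offset + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


variant_1 = "SNRIGCAIHA CQNMNVWGSV"
variant_2 = "SNRIGCAIHT CQNMNVWGSV"
print(differences(variant_1, variant_2, offset=181))
```

For this excerpt the comparison reports a single difference at position 190 (A in Variant 1, T in Variant 2), which is consistent with the GCT and ACT codons printed at the corresponding place in the two DNA variants.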



SEQ ID NO:7 BCX2 DNA SEQUENCE

Nucleic Acid Accession #: NM_003014

Coding sequence: 238-1278 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I

GGCGGGTTCG OGOOOOGAAG GCTGAG AGCT GGOGCTGCTC GTGCOCTGTG TGCCAGACGO 60 
CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 
AAACTCTCCT GOGOOCCAGA AGATTTCTTC CTCGGCGAAG GGACAGGGAA AGATGAGGGT 180 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGOTCGCAO OQCOAQAGGO CAGTGOCATQ 240 
TTOCTCICGA TCCTAGTGGC QCTOTO O CTG TGOCTCCACC TGGCGCTGGG CGTGCGCGGC 300 
GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TCCCCTGGAA CATCACGCGG 360 
ATGCOCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420 
GAGG AGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGGGCT TCTTCTTCTG TGCCATGTAC 480 
GCGCCCATTT GCAOCCTGGA GTTCCTGCAC GACCCTATCA AGCXX5TGCAA GTOGGTGTGC 540 
CAACGOGCGC GOGACGACTG GGAGCCOCTC ATGAAGATGT ACAAOCACAG CTGGCCCGAA 600 
AGOCTGGCCT GGGAOGAGCT GCCTGTCTAT GAOCGTGGCG TGTGCATTTC GGCTGAAGOC 660 
ATCGTCACGG ACCTCCOGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCTTGATGT TG ACTGTAAA OGGCTAAGOC CCGATCGGTG CAAGTGTAAA 780 
AAGGTG AAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGOC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGOTCA CAAOGGTGGT GGATGTAAAA 900 
G AGATCTTCA AGTCCTCATC ACCCATCCCT OGAACTCAAG TCCCGCTCAT TACAAATTCT 960 
TCTTGGCAGT GTCCACACAT CCTGCCCCAT CAAG ATGTTC TCATCATGTG TTAOG AGTGG 1020 
CXHTCAAGGA TGATGCTTCT TCAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 
AAGAAAACAG CGGGGGGCAC CAGTOGTAGT AATCCOCOCA AAOCAAAGGG AAAGCCTCCT 1200 
GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 
AAOCXX3AAAA GAGTGQXzAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTCCTTACAG 1320 
GATGAGGCTG GGCATTGCCT GGGACAGOCT ATGTAAGGOC ATGTGCOOCT TGCOCTAACA 1380 
ACTCACTGCA G 1GC 1 V ITCA TAG ACACATC TTGCAGCATT TTTCTTAAGG CTATGCTTCA 1440 
GTTTTTCTTT GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCXXnTTGG TACAOAAGGT 1500 
GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC TAATATGTGC ATTGTAAAAT 1620 
AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 
TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT GAAAATATAA TGTTTTTAAG 1740 
AAGG AACAGT AGTGG AATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 
TTTTTGTGAT GAAAGGGG AT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 
TGTGTTTTTT TAOCAATG AC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920 
AATAATAAAG AAAAATAAAT AAAAAGG AG A GGCAG ACAAT GTCTGG ATTC CTGTTTTTTG 1980 
GTTACCTGAT TTOCATG ATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040 
ACAGTG AGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAG AG AG GTATGTCACT CATCTTACTT CCCAGGACAT CCACXXTTG AG 2160 
AAT AATTTGA CAAGCTT AAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220 
TTAAATA I i HCU 1G CCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 2280 
AAAGTTG AGT TOCACCTCTG AAATG AGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340 
AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 
ATTTATTTTA AAAAACAATT TTATTGGOCT TTTGCTAACA CAGT AAGCAT GTATTTT ATA 2460 
AGGCATTCAA TAAATGCACA AOGCCCAAAG'GAAATAAAAT OCTATCTAAT OCTACTCPOC 2520 
ACTACACAG A GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCC AGGT GTTTGCTTAT 2580 
GCACTTATAA AATG ATTTOA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640 
CTCOCTOCTT TGCTTGGCCC TTTATTOAG A TAAGTTTTOC TGTCAAG AAA GCAG AAAOCA 2700 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTOTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC AOOGGAATTC 



SEQ ID NO:8 BCX2 Protein sequence

Protein Accession #: NP_003005.1

1 11 21 31 41 51
I I I I I I 

MFLSILVALC LWLHLALG VR GAPCEA VRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60 
YEELVD VNCS AVLRFFPCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120 
ESLACDELPV YDRGVOSPE AIVTDLPEDV KWmiTPDMM VQERPLDVDC KRLSPDRCKC 180 
KKVKPTLATY LSKNYS YVM AKKAVQRSG CNBVTTWDV KEIFKSSSPI PRTQVPLTTN 240 
SSCQCPHILP HQD VUMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300 
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TOPKRV 

SEQ ID NO:9 CBK1 DNA SEQUENCE

Nucleic Acid Accession #: NM_032391

Coding sequence: 129-302 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT 60 

AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA 120 

GAACAGC GAT GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA 180 

AGAOTCCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCOAGACCA 240 

GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGGAGG CCGAGGCAGG AAGATTCCTT 300 

6AGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTGTCTC TATTTATACA 360 
ATAAAATTTT TTTAAAAAAG G 



SEQ ID NO:10 CBK1 Protein sequence
Protein Accession #: NP_115767

1 11 21 31 41 51

I I I I I I

MLCAHFSDQG PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGISG GRGRK3P 
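
The relationship between a coding sequence and its protein listing can be checked by translating the DNA with the standard genetic code, as in the following Python sketch. The function name `translate` is hypothetical, and unrecognised codons are reported as "X" by choice of this sketch. The example input is the start of the CBK1 coding region as printed above, beginning at the ATG at position 129.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" marks unreadable codons
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


cbk1_start = "ATGTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA"
print(translate(cbk1_start))
```

The translated residues, MLCAHFSDQGPAHLTTS, agree with the opening of the CBK1 protein listing; the trailing unpaired base is ignored because it does not complete a codon.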

SEQ ID NO:11 CHA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_020182

Coding sequence: 96-854 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

TCCTTGSGTT CGGGTGAAAG CGCCTGGGGG TTCGTGGCCA TGATCCCC6A 6CTGCT6GA0 60 

AACTGAAGGC GGACAGTCTC CTGCGAAACC AGGCAATSGC 6GAGCT6GAO TTTGTTCAGA 120 

TCATCATCAT CGTGGTGQTQ ATGATGQTGA T6GT6GTGGT GATCACGTGC CTQCTGAGGC 180 

ACTACAAGCT OTCTGCACGG TCCTTCATCA GCCGGCACAG OCAG66G0GG AGGA6A6AA6 240 

ATGCCCTGTC CTCAGAA6GA TGCCTGTGGC CCTCGGAGAG CACAGTGTCA GGCAACGGAA 300 

TCCCAGAGCC GCAGGTCTAC GCCCCOCCTC GGCCCACCGA CCGCCTGGCC GTGCCGCCCT 360 

TCGCCCAGCG GGAGCGCTXC CACCGCTTCC AGCCCACCTA TCCGTACCTG CAGCACGAGA 420 

TCGAOCTGCC ACCCACCATC TCGCTGTCAG ACGGGGAGGA GCCCCCACCC TAOCAGGGCC 480 

CCTGCACCCT CCAGCTTOGG GACCCCGAGC AGCAGCTGGA ACTGAACCGG GAGTCGGTGC 540 

GCGCACCCCC AAACAGAACC ATCTTCGACA GTGACCTQAT GGAXAGTGCC AGGCTGGGCG 600 

GCCCCTGCCC CCCCAGCAGT AACTCGGGCA TCAGCGCCAC GTGCTACGGC AGCGGCGGGC 660 

GCATGGAGGQ GCCGCCGCCC ACCTACAGCG AGGTCATCGG CCACTACCCG GGGTCCTCCT 720 

TCCAGCACCA GCAGAGCAGT GGGCCGCCCT CCTTGCTGGA GGGGACCCGG CTCCACXACA 780 

CACACATCGC GCCOCTAGAG AGCGCAGCCA TCTGGAGCAA AGAGAAGGAT AAACAGAAAG 840 

GACACCCTCT CTAGG GTCCC CAGGGGGGCC GGGCTGGGGC TGCGTAGGTG AAAAGGCAGA 900 

ACACTCCGCG CTTCTTAGAA GAGGAGTGAG AGGAAGGCGG GGGGCGCAGC AAOGCATCGT 960 

GTGGCCCTCC OCTCCCACCT CCCTGTGTAT AAATATTTAC ATGTGATGTC TGGTCTGAAT 1020 

GCACAAGCTA AGAGAGCTTG CAAAAAAAAA AAGAAAAAAG AAAAAAAAAA ACCACGTTTC 1080 

TTTCTTGAGC TGTGTCTTGA AGGCAAAAGA AAAAAAATTT CTACAGTAAA AAAAAAAAAA 1140 
A 



SEQ ID NO:12 CHA1 Protein sequence
Protein Accession #: NP_064567

1 11 21 31 41 51

I I I I I I 

HAELEFVQXI IIWVMMVMV WTTCLLSHY KLSARSFISR HSQGRRKBDA LSSBGCLWPS 60 
ESTVSGNGIP EPCVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHEID LPPTISLSDG 120 
EEPPPYQGPC TXiQLKDPEQQ LBLNRESVRA PPHRTIFDSD LMDSARLGGP CPPSSNSGIS 180 
ATCYGSGGRM EGPPPTYSEV IGHYPGSSFQ RQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240 
SKBKDKQKGH PL 

SEQ ID NO:13 CJA5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012445

Coding sequence: 278-1271 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

GCACGAGGGA AGAGGGTGAT CCGACCCGGG GAAGGTCGCT GGGCAGGGCG AGTTGGGAAA 60 

GCGGCAGCCC CCGCCGCCCC CGCAGCCCCT TCTCCTCCTT TCTCCCACGT CCTATCTOCC 120 

TCTCGCTGGA GGCCAGGCCG TGCAGCATCG AAGACAGGAG GAACTGGAGC CTCATTGGOC 180 

66CC0GGGGC GCCG6CCTOG GGCTTAAATA GGA 6CT C0GG GCTCTGGCTG GGACCCGACC 240 

GCTGCCGGCC GCGCTCCCGC TGCTCCTCCC GGGTGATGGR AAACCCCAGC CCGGOCGCCO 300 

CCCTGGGCAA CGCCCTCTGC GCTCTCCTCC TGGCCACTCT CQGCGCCGCC GGOCAGCCTC 360 

TTGGGGGAGA GTCCATCTGT TCCGCCAGAG CCCCGGCCAA ATACAGCATC ACCTTCACGG 420 

GCAAGTGGAG CCAGACGGCC TTCCCCAAGC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 

GGTCTTCGCT 6CTG6GGGCC GCGCATAGCT CCGACTACAG CATGTGGAGO AAGAACCAGT 540 

ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GGCCTGGGCO CTGATGAAGG 600 

AGATCGAGGC GGCGGGGGAG GCGCTGCAGA GCGTGCACGC GGTGTTTTCG GCGCCCGCCO 660 

TCCCCAGCGG CACCGGGCAO ACGTCGGOGG AGCTGGAGGT GCAGCGCAG6 CACTC6CTG0 720 

TCTCGTTTGT 6GTGCGCATC GT6CCCA6CC CCGACTGGTT CGTGGGCGTG GACAGCCT6G 780 

ACCTOTGCGA CGGGGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG B40 

CCGGGAOGGA CAGCGGCTTC ACCTTCTOCT OCCCCAACTT CGCCACCATC CCGCAGGACA 900 

CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGGC CAACTCCTTC TACTACCCGC 960 

GGCTGAAGGC CCTGCCTCCC ATCGCCAGGQ TGACACTGGT GCGGCTGGGA CAGAGCCCCA 1020 

GGGCCTTCAT CCCTCCCGCC CCAGTCCTGC CCAGCAGGGA CAATGAGATT GTAGACAGCG 1080 

CCTCAGTTCC AGAAACGOOG CTGGACTGCG AGGTCTCCCT GTGGTCGTCC TGGGGACTGT 1140 

GCGGAGGCCA CTGTGGGAGG CTOGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200 

CCGCCAACAA CGGGAGCOCC TQOCC C GAGC TCGAAGAAGA GGCTGAGTGC GTCGCTGAXA 1260 

ACTGCGT CTA AG AOCACAGC CCCOCAGOCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320 

GGCTCCTGTG CAGGCTCATG CTGCAGGOGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 

GACCGCGGTG AGGCCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 

GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC CCOTGTOCCG 1500 

TCTGCTCTCA GCCTOCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTOC AGCTACTCTA 1560 

AATTATGGTC TCCTTATAAG TTATTGCTGC TCCAGGAGAT TGTCCTTCAT CGTCCAGGGG 1620 

CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 

CTCTCCCGAG GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATG GGGCGGTTTC 1740 

GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800 
TCCTCAC 



SEQ ID NO:14 CJA5 Protein sequence

Protein Accession #: NP_036577

1 11 21 31 41 51

I I I I I I

MENPSPAAAL GKALCALLLA TLGAAGQPLG GESICSARAP AKYSITFTGK WSQTAFPKQY 60 

PLFRPPAQWS SLLGAAHSSD YSMWRKNQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120 

HAVFSAPAVP SGTGQTSAKL EVQRRHSLVS PWRIVPSPD WPVGVDSLDL CDGDRWREQA 180 

ALDLYPYDAG TDSGFTFSSP NFATIPQDTV TEZTSSSPSH PANSFYYPRL KALPPIARVT 240 

LVRLRQSPRA FIPPAPVLPS RDNBIVDSAS VPETPLDCW SLWSSWGICG GBCGRLGTKS 300 
RTRYVRVQPA NNGSPCPELE EEAECVPDNC V 



SEQ ID NO:15 LBH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_002391

Coding sequence: 26-457 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 

CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GG6GCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTCCGCGAGG G CACCT GCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 

TGCGfrGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCT A 360 

CAATGCTCAG TGOCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGAOCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GQACTAGACG CCAAGCCTGG ATGCCAAGGA 480 

GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 

CACCAGTGOC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 

TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCOOGCTT TTGTTCTTCC CCACAATTCC 720 

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCOOCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



SEQ ID NO:16 LBH9 Protein sequence 
Protein Accession #: NP_002382 

1 11 21 31 41 51 

I I I I I I 

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60 
CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GKD 
SEQ ID NO:17 LEM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005244 

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGTAGAAC TAGTGATCTC ACCCAGCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGCTGACGC TGCTGTGTGG ACTCTCAGTG ACAGACAAGQ CATCACCAAA 120 

TCGGCCCCCC TGAGAGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAGCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCCCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA GCACCTCACC CACTGGACAQ AGCCCATACA CCTACCAGAT GCACGGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACGCAG CCGGTTTCGG GAGTGTGCAC 480 

CAGGACTATC CTTCCTACCC CGGCTTCO CC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCCCTACGT C CCGG CCAOC AGCATCT GCC CTTCGCCCCT CTCC ACGT CC 600 

ACCTACGTCC TCCAGGAGGC ATCTCACAAC GTCCCCAACC AGAGTTCCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAGCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACCGGG CCTCCGACGG GAAGCTCCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCCTOTGGQ ACTTGGATGA GACAATAATT 840 

ATTTTTCACT CCTTACTCAC GGGGACAHTTT GCATCCAGAT ACGGGAAGGA CACCACGACG 900 

TCCGTGCGCA TTGGCCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAATG ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATGGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTGGG CTCTGGCGTG CACGGCGGCQ TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAGATG TACAATACCT ACAAGAACAA CGTTGGTGGG 1200 

TTGATAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCTGAOCCA CTCCCTGAAG GCACTAAACC TCATCAACTC CCGGCCCAAC 1320 

TGTGTCAATG TGCTGGTCAC CAOCACTCAA CTAATTCCTG CCCTGGCCAA AGTOCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TCCTGCCACG CAGACCTGGA GGCACTGAGG CACGCCCTGG AACTGG AGTA TTTAXAQ 

SEQ ID NO:18 LSM9 Protein sequence 
Protein Accession #: NP_005235 



1 11 21 31 41 51 

I I I I I I 

MVELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPRVLPR 60 

QPSTAMAAYG QTQYSAGIQQ ATPYTAYPPP AQAYGIPSYS IKTEDSLNHS PGQSGFLSYG 120 

SSFSTSPTGQ SPYTYQMHGT TGFYQGGNGL GNAAGFGSVH QDYPSYPGFP QSQYPQYYGS 180 

SYNPPYVPAS SICPSPLSTS TYVLQEASHN VPNQSSESLA GEYNTHNGPS TPAKEGDTDR 240 

PHRASDGKLR GRSKRSSDPS PAGDNEIERV FVWDLDETII IFHSLLTGTF ASRYGKDTTT 300 

SVRIGLMMEE MIFNLADTHL FFNDLEDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360 

GANLCLGSGV HGGVDWMRKL AFRYRRVKEM YNTYKNNVGG LIGTPKRETW LQLRAELEAL 420 

TDLWLTHSLK ALNLINSRPN CVNVLVTTTQ LIPALAKVLL YGLGSVFPIE NIYSATKTGK 480 
ESCFERIMQR FGRKAVYVVI GDGVEEEQGA KKHNMPFWRI SCHADLEALR HALELEYL 



SEQ ID NO:19 QAA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002740 

Coding sequence: 178-1968 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I 

CCGCGGTTCC GGCTGCTCCG GCGAGGCGAC CCTTGGGTCG GOGCTGCGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGCGGAG TCCCCCACGG 120 

CGCCCOAAGC GCOCCCCGCA CCCCCGGCCT CCAGCGTTGA GGGGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACAGCAO CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240 

CATTCCCACC AGGTCCGGGT GAAAGCCTAC TACOGCGGGG ATATCATGAT AACACATTTT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAAGA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG T CT TOOCTTQ TGTACCAGAA CGTCCTGGGA TGCCTTGTOC AGGAGAAGAT 540 

AAATCCATCT ACCGTAGAGG TGCACGCOGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG CCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TGCCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TGCCACAGGA ACCAGTGATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATG AGAGTTTGGA TCAAGTTGGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCTT CAGGATTTTG ATTTGCTCCG GGTAATAGGA 960 

AGAGGAAGTT ATGCCAAAGT ACTGTTGGTT CGATEAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA GCTTGTTAAT GATGATGAGG ATATTQATTG GGTACAGACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATOCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT GTTCTTTGTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTOCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 
ATCAGTCTAG CATTAAATTA TCTTCATGAG CGAGGGATAA TTTATAGAGA TTTGAAACTG 1320 

GACAATGTAT TACTGGACTC TGAAG6CCAC ATTAAACTCA CTGACTACGG CATGTGTAAG 1380 

GAAGGATTAC GGCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTOCT 1440 

CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG A CTC GTQQQC TCTTGGAGTQ 1500 

CTCATGTTTG AGATGATGGC AGGAAGGTCT CCATTTGATA TTGTTGGGAG CTCCGATAAC 1560 

CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620 

CCACGTTCTC TGTCTGTAAA AGCTGCAAGT QTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680 

AAGGAACGAT TGGGTTGTCA TCCTCAAACA GQATTTGCTG ATATTCAGGG ACACCCOTTC 1740 

TTCCGAAATQ TTGATTGGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800 

AATATTTCTG GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAAOCTGTC 1860 

CAGCTCACTC CAGATGACGA TGACATTGTO AGGAAGATTG ATCAGTCTGA ATTTGAAQGT 1920 

TTTGAGTATA TCAATCCTCT TTTGATGTCT GCAGAAGAAT GTGT CTQAT C CTCATTTTTC 1980 

AACCATGTAT TCTACTCATG TTCCCATTTA ATGCATCQAT AAACTTOCTC CAAGCCTGGA 2040 

TACAATTAAC CATCTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTCTAG 2100 

ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160 

TCCAGACAAT CATGTCAAAA TTTAGTTGAA CTGGTTTTTC ACTTTTTAAA AGGCCTACAG 2220 
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA G 



SEQ ID NO:20 QAA1 Protein sequence 
Protein Accession #: NP_002731 

1 11 21 31 41 51 

I I I I I I 

MSHTVAGGGS GDHSHQVRVK AYYRGDIMIT HFEPSISFEG LCNEVRDMCS FDNEQLFTMK 60 
WIDEEGDPCT VSSQLELEEA FRLYELNKDS ELLIHVFPCV PERPGMPCPG EDKSIYRRGA 120 
RRWRKLYCAN GHTFQAKRFN RRAHCAICTD RIWGLGRQGY KCINCKLLVH KKCHKLVTIE 180 
CGRHSLPQEP VMPMDQSSMH SDHAQTVIPY NPSSHESLDQ VGEEKEAMNT RESGKASSSL 240 
GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI YAMKVVKKEL VNDDEDIDWV QTEKHVFEQA 300 
SNHPFLVGLH SCFQTESRLF FVIEYVNGGD LMFHMQRQRK LPEEHARFYS AEISLALNYL 360 
HERGIIYRDL KLDNVLLDSE GHIKLTDYGM CKEGLRPGDT TSTFCGTPNY IAPEILRGED 420 
YGFSVDWWAL GVLMFEMMAG RSPFDIVGSS DNPDQNTEDY LFQVILEKQI RIPRSLSVKA 480 
ASVLKSFLNK DPKERLGCHP QTGFADIQGH PFFRNVDWDM MEQKQVVPPF KPNISGEFGL 540 
DNFDSQFTNE PVQLTPDDDD IVRKIDQSEF EGFEYINPLL MSAEECV 



SEQ ID NO:21 OBH2 DNA SEQUENCE 

Nucleic Acid Accession #: L05628 

Coding sequence: 197-4792 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CCAGGCGGCG TTG CGG CCCC GGCCCCGGCT CCCTGCGCCG CCGCCGCCGC CGCCGCCGOC 60 

6CCGCCGCC6 CCGCCGCCAG CGCTAGCGCC AGCAGCCGGG CCCGATCACC CGCCGCCCGG 120 

TGCOCGCCGC CGCCCGCGCC AGCAACCGGG CCCGATCACC CGCCGCCCGG TGCCCGCCGC 180 

CGCCCGCGCC A CCGGCATGQ CGCTCCGGGG CTTCTGCAGC GCCGATGGCT CCGACCCGCT 240 

CTGGGACTGG AATGTCACOT GGAATACCAG CAACCCCGAC TTCACCAAGT GCTTTCAGAA 300 

CACGGTCCTC GTGTGGGTGC CTTGTTTTTA CCTCTGGGCC TGTTTCCCCT TCTACTTCCT 360 

CTATCTCTCC CGACATGACC GAGGCTACAT TCAGATGACA CCTCTCAACA AAACCAAAAC 420 

TGCCTTGGGA TTTTTGCTGT GGATCGTCTG CTGGGCAGAC CTCTTCTACT CTTTCTGGGA 480 

AAGAAGTCGG GGCATATTCC TGGCCCCAGT GTTTCTGGTC AGCCCAACTC TCTTGGGCAT 540 

CACGACGCTG CTTGCTACCT TTTTAATTCA GCTGGAGAGG AGGAAGGGAG TTCAGTCTTC 600 

AGGGATCATG CTCACTTTCT GGCTGGTAGC CCTAGTGTOT GCCCTAGCCA TCCTGAGATC 660 

CAAAATTATG ACAGCCTTAA AAGAGGATGC CCAGGTGGAC CTGTTTCGTG ACATCACTTT 720 

CTACGTCTAC UVm a X T CT TACTCATTCA GCTCGTCTTG TCCTGTTTCT CAGATCGCTC 780 

ACCCCTGTTC TCGGAAACCA TCCACGACCC TAATCCCTGC CCAGAGTCCA GCGCTTCCTT 840 

CCTGTCGAGG ATCACCTTCT GGTGGATCAC AGGGTTGATT GTCCGGGGCT ACOGCCAGCC 900 

CCTGGAGGGC AGTGACCTCT GGTCCTTAAA CAAGGAGGAC ACGTCGGAAC AAGTCGTGCC 960 

TGTTTTGGTA AAGAACTGGA AGAAGGAATG CGCCAAGACT AGGAAGCAGC CGGTGAAGGT 1020 

TGTGTACTCC TCCAAGGATC CTGCCCAGCC GAAAGAGAGT TCCAAGGTGG ATGCGAATGA 1080 

GGAGGTGGAG GCTTTGATCG TCAAGTCCCC ACAGAAGGAG TGGAACCCCT CTCTGTTTAA 1140 

GGTGTTATAC AAGACCTTTG GGCCCTACTT CCTCATGAGC TTCTTCTTCA AGGCCATCCA 1200 

CGACCTGATG ATGTTTTCCG GGCCGCAGAT CTTAAAGTTG CTCATCAAGT TCGTGAATGA 1260 

CACGAAGGCC CCAGACTGGC AGGGCTACTT CTACACCGTG C TOC TOT T T B TCACTGCCTS 1320 

CCTGCAGACC CTCGTGCTGC ACCAGTACTT CCACATCTGC TTCGTCAGTG GCATGAGGAT 1380 

CAAGACCGCT GTCATTGGGG CTGTCTATCG GAAGGCCCTG GTGATCACCA ATTCAGCCAG 1440 

AAAATCCTCC ACGGTCGGGG AGATTGTCAA CCTCATGTCT GTGGACGCTC AGAGGTTCAT 1500 

GGACTTGGCC ACGTACATTA ACATGATCTG GTCAGCCCCC CTGCAAGTCA T CCTTGCT C T 1560 

CTACCTCCTG TGGCTGAATC l\* 3 GOCCTTC CGTCCT GG CT GGAGTGGCGG TGATGGTCCT 1620 

CATGGTGCCC GTCAATGCTG TGATGGCGAT GAAGACCAAG ACGTATCAGG TGGCCCACAT 1680 

GAAGAGCAAA GACAATCGGA TCAAGCTGAT GAACGAAATT CTCAATGGGA TCAAAGTGCT 1740 

AAAGCTTTAT GCCTGGGAGC TGGCATTCAA GGACAAGGTG CTGGCCATCA GGCAGGAGGA 1800 

GCTGAAGGTG CTGAAGAAGT CTGCCTACCT GTCAGCCGTG GGCACCTTCA CCTGGGTCTG 1860 

CACGCCCTTT CTGGTGGCCT TGTGCACATT TGCCGTCTAC GTGACCATTG ACGAGAACAA 1920 

CATCCTGGAT GCCCAGACAG CC1TCGTGTC U T J U U JUTT G TTCAACATCC T C U U tflTl U; 1980 

CCTGAACATT CTCCCCATGG TCATCAGCAG CATCGTGCAG GCGAGTOTCT CCCTCAAACG 2040 

CCTGAGGATC TTTCTCTCCC ATGAGGAGCT GGAACCTGAC AGCATCGAGC GACGGCCTGT 2100 

CAAAGACGGC GGGGGCACGA ACAGCATCAC CGTGAGGAAT GCCACATTCA CCTGGGC CA Q 2160 

GAGCGACCCT CCCACACTGA ATGGCATCAC CTTCTCCATC CCCGAAGGTG CmtXHGGC 2220 
CGTOGTGGGC CAGGTGGGCT GCGGAAAGTC GTCCCTGCTC TCAGCCCTCT TGGCTGAGAT 2280 

GCACAAAGTG GAGGGGCACG TGGCTATCAA 66GCTCCGT6 GCCTATGTGC CACAGCAGGC 2340 

CTGGATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT GGATGTCAGC TGGAGGAACC 2400 

ATATTACAOO TCCGTGATAC AGGCCTOTQC CCTCCTCCCA GACCTGGAAA TCCTGCCCAQ 2460 

TGGGGATCGG ACAGAGATTG GCGAGAA860 CGTGAACCTO TCTCGGGGCC A6AAGCAGCG 2520 

CGTGAGCCTO GCCCGGGCCO TGTACTCCAA CGCTGACATT TACCTCTTCG ATGATCCCCT 2580 

CTCAGCAGTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTG GCCCCAAGGG 2640 

GATGCTGAAG AACAAGACGC GGATCTTGGT CACGCACAGC ATQAGCTACT TGCCGCAGGT 2700 

GGACOTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAG ATGGGCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCGTACC TATGCCAGCA CAGAGCAGGA 2820 

GCAGOATGCA GAGGAGAACQ GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGO TGACGGACAG TGCAGGGAAG CAACTQCAGA GACAGCTCAG 2940 

CAG C TCCTCC TCCTATAGTG GGGACATCAG CAGGCACCAC AACAGCACCG CAGAACTGCA 3000 

GAAAGCTGAO GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAQAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATCGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTTCCTTT TCATGTGTAA CCATGTGTCC GCGCTGGCTT CCAACTATTG 3180 

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAAAGTCCG 3240 

GCTGAGCGTC TATGGAGCCC TGGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

GGCCGTGTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTG CACGTGGACC TGCTGCACAG 3360 

CATCCTG CGQ TCACCCATGA GCTTCTTTGA GCGGACCCCC AGTGGGAACC TGGTGAACCG 3420 

CTTCTCCAAG GAGCTGGACA CAGTGGACTC CATGATCCCG GAGGTCATCA AGATGTTCAT 3480 

GGGCTCCCTG TTCAACGTCA TTGGTGCCTG CATCGTTATC CTGCTGGCCA OGOCCATCGC 3540 

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCT0CC6G CAGCTGAAGC GCCTCGAGTC GGTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC T TGCTG GGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AGCGCTTCAT 3720 

CCACCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCCCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC TGTTTGCTGC 3840 

CCTGTTTGCG GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGCC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA CGTACTTGAA CTGGCTGGTT CGGATGTCAT CTGAAATGGA 3960 

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTATTCA GAGACTGAGA AGGAGGCGCC 4020 

CTGGCAAATC CAGGAGACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTQACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGQ GTTCCCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380 

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGCCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGOGGAG AACCTCAGTG TCGGGCAGCQ 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAGATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCC6T6QACC TGGAAACGGA CGACCTCATC CAGTCCACCA TCCGGACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCGGCTC AACACCATCA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGCATGGC CAAAGACGCC GGCTTGGT GT GAG CCCCAOA 4800 

GCTGGCATAT CTGGTCAGAA CTGCAOGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

CCCCTGGTAA ACCAAGCCTC CCACACTGAA ACCAAAACAT AAAAACCAAA CCCAGACAAC 4920 

CAAAACATAT TCAAAGCAGC AGCCACCGCC ATCCGGTCCC CTGOCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGCGAACCAC C 



SEQ ID NO:22 OBH2 Protein sequence 

1 11 21 31 41 51 

I I I I I I 

MALRGFCSAD GSDPLWDWNV TWNTSNPDFT KCFQNTVLVW VPCFYLWACF PFYFLYLSRH 60 

DRGYIQMTPL NKTKTALGFL LWIVCWADLF YSFWERSRGI FLAPVFLVSP TLLGITTLLA 120 

TFLIQLERRK GVQSSGIMLT FWLVALVCAL AILRSKIMTA LKEDAQVDLF RDITFYVYFS 180 

LLLIQLVLSC FSDRSPLFSE TIHDPNPCPE SSASFLSRIT FWWITGLIVR GYRQPLEGSD 240 

LWSLNKEDTS EQVVPVLVKN WKKECAKTRK QPVKVVYSSK DPAQPKESSK VDANEEVEAL 300 

IVKSPQKEWN PSLFKVLYKT FGPYFLMSFF PKAIHDLMMP SGPQILKLLI KFVNDTKAPD 360 

WQGYFYTVLL FVTACLQTLV LHQYFHICFV SGHRIKTAVZ GAVYRKALVI TNSARKSSTV 420 

GEIVNLHSVD AQRFMDLATY IKMIWSAPLQ VXLALYLLHL HLGPSVLAGV AVHVLKVPVM 480 

AVMAMKTKTY QVAHMKSKDN RIKLMNEILN GIKVLKLYAW ELAFKDKVLA IRQEELKVLK 540 

KSAYLSAVOT FTWVCTPPLV ALCTPAVYVT IDENNILDAQ TAPVSLALFN ILRPPLNILP 600 

KVTSSIVQAS VSLKRLRIFL SHEELEPDSI BRRPVKDGGG TNSITVRNAT PTWARSDPPT 660 

LNGTTFSIPB GALVAWGQV GCGKSSLLSA LLAEMDKVBG HVAIKGSVAY VPQQAWIQND 720 

SLRENILFGC QLEEPYYRSV IQACALLPDL EXLPSGDRTB IGEKGVNLSG GQKQRVSLAR 780 

AVYSNADIYL PDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHSHS YLPQVDVUV 840 

KSGGKISEMG SYQKLLARDG AFAEPLRTYA STBQEQDAEE NGVTGVSGPG KEAKQMENGH 900 

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEBTWKLMEA DKAQTGQVKL 960 

SVYWDYMKAI GLPISFLSIF LFMCNHVSAL ASNYWLSLWT DDPIVNGTQE HTKVRLSVYG 1020 

ALGISQGIAV PGYSMAVSIG GILASRCLHV DLLHSILRSP MSFFERTPSG NLVNRFSKEL 1080 

DTVDSKIPEV IKMFMGSLFN VZGACIVILL ATPIAAIIIP PLGLIYFFVQ RFYVASSRQL 1140 

KRLESVSRSP VYSHFNETLL GVSVIRAFEB QERFIHQSDL KVDENQKAYY PSIVANRWLA 1200 

VRLECVGNCI VLFAALFAVI SRHSLSAGLV GLSVSYSLQV TTVLNWLVRM SSEMETNIVA 1260 

VKRLKKYSBT EKEAPWQIQK TAPPSSWPQV GRVEFRNYCL RYREDLDFVL RHISVTHK3G 1320 

BKVGIVGRTG AGKS5LTLGL FRIKESAEGB ZZIOGZNIAK IGLHDLRFKI TIIPQDPVLP 1380 

SGSLRMNLDP PSQYSDEEVW TSLELAHLKD PVSALPDKLD HECAEQGEHL SVGQRQLVCL 1440 

ARALLRKTKI LVLDEATAAV DLETDDLIQS TIRTQFEDCT VLTIAHRLNT IMDYTKVTVL 1500 
DKGEIQEYGA PSDLLQQRGL FYSMAKDAGL V 
SEQ ID NO:23 PAA2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_013309 

Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGCCGOCT CTGGCGCOTG GAAGCGCCTC AAATCTATGC TAAGGAAGGA TGATGCGCCG 60 

CTGTTTTTAA ATGACACCAG CGCCTTTGAC TTCTCGGATG AGGCGGGGGA CGA66GGCTT 120 

TCTCGGTTCA ACAAACTTCG AGTTGTGGTG GCCGATGACG GTTCCGAAGC CCCGOAAAGG 180 

CCTGTTAACG GGGCGCACCC GACCCTCCAG GCCGACGATG ATTCCTTACT GGACCAAGAC 240 

TTACCrTTOA CCAACAGTCA GCTCAGTTTG AAGGTGGACT CCTGTGACAA CTGCAGCAAA 300 

CAGAGAGAGA TACTGAAGCA GAGAAAGGTG AAAGCCAGGT T6ACCATTQC TGCCGTTCTG 360 

TACTTGCTTT TCATGATTGG AGAACTTGTA GGTGGATACA TTGCAAATAQ CCTAGCAATC 420 

ATGACAGATG CACTTCATAT GTTAACTGAC CTAAGCGCCA TCATACTCAC CCTGCTTGCT 480 

TTGTGGCTAT CATCAAAATC AOCAACCAAA AGATTCACCT TTGQATTTCA TCGCTTAGAG 540 

GTTTTCTCAO CTATGATTAG TGTGCTGTTG GTGTATATAC TTAT60CATT CCTCTTATAT 600 

GAAGCTGTGC AAAGAACTAT CCATATCAAC TATGAAATAA ATGGAGATAT AATGCTCATC 660 

ACCGCAGCTG TTGGAGTTGC AGTTAATGTA ATAATGGGGT TTCTGTTGAA OCAGTCTGGT 720 

CACCGTCACT CCCATTCCCA CTCCCT GC CT TCAAATTCCC CTACCAGAOG TTCTGGGTGT 780 

GAACGSAACC ATGGGCAGGA TAGCCTGGCA GTGAGAGCTG CATTTGTACA TGCTTTGGGA 840 

GATTTGGTAC AGAGTGTTGG TGTGCTAATA GCTGCATACA TCATACGATT CAAGCCAGAA 900 

SACAAGATTG CTGATCCCAT CTGTACATAC GTATTTTCAT TACTTGTGGC TTTTACAACA 960 

TTTCGAATCA TATGGGATAC AGTAGTTATA ATACTAGAAG GTGT6CCAAG CCATTTGAAT 1020 

GTAQACTATA TCAAAGAAGC CTTGATOAAA ATAGAAGATG TATATTCAGT CGAAQATTTA 1080 

AATATCTCGT CTCTCACTTC AGGAAAATCT ACTGCCATAG TTCACATACA GCTAATTCCT 1140 

GGAAGTTCAT CTAAATGGGA GGAAGTACAG TCCAAAGCAA ACCATTTATT ATTGAACACA 1200 

TTTGGCATGT ATAGATGTAC TATTCAGCTT CAGAGTTACA GGCAAGAAGT GGACAGAACT 1260 
TGTGCAAATT GTGAGAGTTC TAGTCCCTGA 



SEQ ID NO:24 PAA2 Protein sequence 
Protein Accession #: NP_037441 

1 11 21 31 41 51 

I I I I I I 

MAGSGAWKRL KSMLRKDDAP LFLNDTSAFD FSDEAGDEGL SRFNKLRVVV ADDGSEAPER 60 

PVNGAHPTLQ ADDDSLLDQD LPLTNSQLSL KVDSCDNCSK QREILKQRKV KARLTIAAVL 120 

YLLFMIGELV GGYIANSLAI MTDALHMLTD LSAIILTLLA LWLSSKSPTK RFTFGFHRLE 180 

VLSAMISVLL VYILMGFLLY EAVQRTIHMN YEINGDIMLI TAAVGVAVNV IMGFLLNQSG 240 

HRHSHSHSLP SNSPTRGSGC ERNHGQDSLA VRAAFVHALG DLVQSVGVLI AAYIIRFKPE 300 

YKIADPICTY VFSLLVAFTT FRIIWDTVVI ILEGVPSHLN VDYIKEALMK IEDVYSVEDL 360 

NIWSLTSGKS TAIVHIQLIP GSSSKWEEVQ SKANHLLLNT FGMYRCTIQL QSYRQEVDRT 420 
CANCQSSSP 



SEQ ID NO:25 PAA3 DNA SEQUENCE 

Nucleic Acid Accession #: AB037765 

Coding sequence: 375-2788 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GCCGAGTCGG TGGCQGCTGC AGGCTGGGAG GGAGAAGTGC TACGCCTTTQ CAGGTTGGCG 60 

AAGTGGTTCC AGGCTACCCG GCTAGTCTGG CACGGCCCCG TC1TC TCC CT CCTCCTCCGT 120 

CGOGTGGOGG OGGGAACTGT TGGCCGCGCG GCCTCGGGAA CGGCCCAGGT OCCCGCCCGC 180 

AGGTOCCGGG CAGATAACAT AGATCATCAG TAGAAAACTT CTTGAAGTTG TTCAAGAAAA 240 

ATTTGAAAOT AGCAAAATAG AAAATAAAGA ATTAACAGCA GATACAGAGG ACAGCATGGA 300 

AGTGTTGTCT TAGGAAACAG AACACAGCAG TGAAAAAACA GACAAAATCC GCTCAGATAC 360 

AACTGCAGCT GATAATGTTT TCCGGCTTCA ATGTCTTTAG AGTTGGGATC TCTTTTGTCA 420 

TAATGTGCAT TTTTTACATG CCAACAGTAA ACTCTTTAOC AGAACTGAGT CCTCAGAAAT 480 

ATTTTAGTAC ATTGCAACCA GGTCTTGAAG AACTGAATGA GGCTGTTAGA CCTCTGCAGG 540 

ACTATGGAAT TTCAGTTGCC AAGGTTAATT GTGTCAAAGA AGAAATATCA A6ATACTGTG 600 

GAAAAGAAAA GGATTTGATG AAAGCATATT TATTCAAGGG CAACATATTG CTCAGAGAAT 660 

TCCCTACTGA CACCTTGTTT GATGTGAATG CCATTGTCGC CCATGTTCTC TTTGCTCTTC 720 

TTTTTAGTGA AGTGAAATAT ATTACCAACC TGGAAGACCT TCAGAACATA GAAAATGCTC 780 

TGAAAGGAAA AGCAAATATT ATATTCTCAT ATGTAAGAGC CATTGGAATA CCAGAGCACA 840 

GAGCAGTCAT GGAAGCCGGT TTTGTGTATO GGACTACAXA CCAATTTGTC TTAACCACAG 900 

AAATTGCCCT TTTGGAAAGT ATTGGCTCTG AGGATGTGGA ATATGCACAT CTCTACTTTT 960 

TTCATTGTAA ACTAGTCTTG GACTTGAOOC AGCAATGTAG AAGAACACTA ATGGAACAGC 1020 

CATTGACTAC ACTGAACATT CAO CTQTT TA TTAAGACAAT GAAAGCACCT CTGTTGACTG 1080 

AAGTTGCTOA AGATCCTCAA CAAGTTTCAA CTGTOCATCT CCAACTGGGC TTACCACTGG 1140 

TTTTTATTGT TAGOCAACAG GCTACTTATG AAGCTGATAG AAGAACTGCA GAATGGGTTG 1200 

CTTGGCGTCT TCTGGGAAAA GCAGGAGTTC TACTCTTGTT AA6GGACTCT TTGGAAGTGA 1260 

ACATTCCTCA AGATGCTAAT GTGGTCTTCA AAAGAGCAGA AGAGGGAGCT CCAGTGGAAT 1320 

TTTTGGTATT ACATGATGTT GATTTAATAA TATCTCATGT GGAAAATAA7 ATGCACATTG 1380 

AGGAAATACA AGAAGATGAA GACAATGACA TGGAAGGTCC AGATATAGAT GTTCAGGATQ 1440 

ATGAAGTGGC AGAAACTGTT TTCAGAGATA GGAAGAGAAA ATTACCTTTG GAACTTACAG 1500 
TGGAACTAAC AGAAGAAACA TTTAATGCAA CAGTGATGGC TTCTGACAGC ATAOTACTCT 1560 

TCTATGCTOO TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAO 1620 

TTAAACTGAA ASGCACATCT ACTATGCTTC TTACTAOAAT AAACTGTGCA GATTGGTCTG 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740 

AGAACCCAGT ATCTTATGCT GGAATGTTAG GAACCAAAGA TCTCCTAAAA TTTATCCAGC 1800 

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA GAATATTTAA 1860 

GTOGOCAATT ATATAAAGAC CTCATCTTGT ATTCTAQTGT GTCAGTATTG GGACTATTTA 1920 

GTCCAACCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AGGAAACTAC CTAAAAGGAT 1980 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCTTCC AGCCCTGCTG CTTGCCAGAC ACACAGAAGQ CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA CCATTATTGA 2220 

TTTTGTTCAG TGATGGCACT GTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280 

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC OC TT OC T C TT CTTGTTTTGG 2400 

TGAATCTGCA TTCAGGTGGC CAAGTATTTG CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATTAG AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA CCTCCTCTTC CAGCTTATGA TTTTCTAAGT ATGATAGATG 2580 

CCGCAACATC TCAACOTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGATA AATCGGCAGT CAGAAAAGAA CCGATTGAAA 2700 

CTCTGAGAAT AAAGCATTGG AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AAATCATTTA 2760 

GACGTGATAA AGAGTTAGGA TGCTC AAAAG TGAACTAATT TTATAGGGCT GTGGTTTCCA 2820 

AAATTTTTTT GGCATGAXAG ACTTAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGATATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CQAAATATAQ AAATTATTAA TGAGATATTT 3060 

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTATACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTQ TGACTACTGT 3180 

ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCTA GCCATGATTT 3240 

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTGCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGC G TTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CTAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480 

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CT GGT TCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGATTATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720 

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAG GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTOTCT CATTTTTGCT 4020 

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTr GGTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTCT T C TT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140 

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAG CTAAAACTCA GTAAGTGCAT 4200 

MIATT TCCri 1 ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTCCTCA ATTTTTTTTC 4260 

CTGTAGCATG TGATGCCTGA TTAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATCAGAGT GAACTCTAAA TATAGGTTGT AGTAATAAAA CATCATTAGC CTAATTATTA 4380 

GAAAATGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTGTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAATACCTT CAAAAAAAAA AAAAAA 

SEQ ID NO:26 PAA3 Protein sequence 
Protein Accession #: BAA92582 

1 11 21 31 41 51 

I I I I I I 

MFSGFNVFKV GZSFVZKCIF YHPTVNSLPE LSPQKYFSTL QPGLEELNEA VRPLQDYGIS 60 

VAKVNCVKEE ISRYCGKEKD LMKAYLFKGN ILLREFPTDT LFDVNAIVAH VLFALLFSEV 120 

KYITNLEDLQ NIBNALKGKA KIIFSYVRAI GIPEHRAVMB AGFVYGTTYQ FVLTTEIALL 180 

ESIGSEDVEY AHLYPFHCKL VLDLTQQCRR TLMEQPLTTL NIHLFIKTMK APLLTEVAED 240 

PQQVSTVHLQ -LGLPLVFXVS QQATYEADER TAEWVAWRLL GKAGVLLLLR DSLEVNIPQD 300 

ANWFKRAEE GVPVBFLVLH DVDLIISHVE MNMHIEBIQB DEDNDMEGPD IDVQDDEVAE 360 

TVFRDKKRKL PLBLTVELTE ETFNATVMAS DSXVLFYAGW QAVSMAELQS YIDVAVKLKG 420 

TSTMLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGENPVS YAGMLGTKDL LKPTQLHRIS 480 

YPVNITSIQE AEEYLSGELY KDLILYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGYVTTG 540 

IY5EE0VLLL STKYAASLPA LLLARHTEGK IBSIPLASTH AQDZVQIITD ALLEMPPEIT 600 

VENLPSYFRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660 

RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAXIEENLVL WLKKLEAGLE NHITILPAQE 720 

WKPPLPAYDF LSKZDAATSQ RGTRKVPKCM KETDVQENDK EQHEDKSAVR KBPIBTLRIK 780 
HWNRSNWFKE AEKSFRRDKE LGCSKVN 

SEQ ID NO:27 PAA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012449 

Coding sequence: 66-1085 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAG 60 

AATTAATGGA AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120 
GGAGAAATTT AGAAGAAGAC GATTATTTGC ATAAGGACAC GGGAGAGACC AGCATGCTAA 180 

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TQATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAGTGGCA CTTGCCAATT AAAATAGCTG 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTTCTGAG GGAAGTAATT CACCCTTTAG 360 

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420 

CAATGGTTTC CATCACTCTC TTGGCATTGG TTTACCTGCC AGGTGTGATA GCAGCAATTG 480 

TCCAACITCA TAATGGAACC AAGTATAAGA AGTTTCCACA TTGGTTGGAT AAGTGGATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCAGTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

GTCTGTCTTA OCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATGCCTGGA TTGAGCATGA TGTTTGGAGA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG GGATTGGCAA TACTGGCTCT GTTGGCTGTG ACATCTATTC 780 

CATCTGTGAG TOACTCTTTG ACATGGAGAQ AATTTCACTA TATTCAGAGC AAGCTAGGAA 840 

TrGTTTCCCT TCTACTGGGC ACAATACACG CATTGATTTT TGCCTGGAAT AAGTGGATAG 900 

ATATAAAACA ATTTGTATGG TATACACCTC CAACTTTTAT GATAGCTGTT TTCCTTCCAA 960 

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAGATACTGA 1020 

AGATTAGACA TGGTTGGGAA GACGTCACCA AAATTAACAA AACTGAGATA TGTTCCCAGT 1080 

TGTAGAATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACATT 1140 
TCAAGTTTGT ATTTGTTAAT AAAATGATTA TTCAAGGAAA AAAAAAAAAA AAAAA 

SEQ ID NO:28 PAA5 Protein sequence 
Protein Accession #: NP_036581 

1 11 21 31 41 51 

I I I I I I 

MESRKDITNQ EELWKMKPRR NLEEDDYLHK DTGETSMLKR PVLLHLHQTA HADEFDCPSE 60 

LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVIHPLAT SHQQYFYKIP ILVINKVLPM 120 

VSITLLALVY LPGVIAAIVQ LHNGTKYKKF PHWLDKWMLT RKQFGLLSFF FAVLHAIYSL 180 

SYPMRRSYRY KLLNWAYQQV QQNKEDAWIE HDVWRMEIYV SLGIVGLAIL ALLAVTSIPS 240 

VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQFVWYTPPT FMIAVFLPIV 300 
VLIFKSILFL PCLRKKILKI RHGWEDVTKI NKTEICSQL 

SEQ ID NO:29 PAA7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_030774 

Coding sequence: 1-963 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGAGTTCCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60 

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120 

AACTGCATCG TGGTCTTCAT CGTAAGGAOG GAACGCAGCC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

ATTGCCCTTT TCTGGTTTGA TTCCCGAGAG ATTAGCTTTG AGGCCTGTCT TACCCAGATG 300 

TTCTTTATTC ATGCCCTCTC AGCCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATGTGG CCATCTGCCA CCCACTGOGC CATGCTGCAG TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480 

CTGATCAAGG GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540 

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCCCA ATGTGCTATA TGGTCTTACT 600 

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCTA TTTTCTGATA 660 

ATACGAACGG TTCTGCAACT GOCTTCCAAG TCAGAGCGGG CCAAGGCCTT TGGAACCTGT 720 

GTGTCACACA TTGGTGTGGT ACTCOOCTTC TATGTGCCAC TTATTGGCCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGA CCCTTAA CACTACACTT CTCCTTATCT TTOSFF9GCTT GATAAACATA ATTATTTCTA 1020 

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA CATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320 

TTTATTATGG TTAGCTGTCA CATACAACTT CTTTTCTTCT TGAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGCGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGTT 1440 

GAAGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCACCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAGTGTT GGGATTACAG 1620 

GTGTGAACCA CTGTGCCCGG CCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATGGAATAAC ATATCAAATG AAACAGGGAA 1860 

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTTAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATGAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTGTGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160 

TTGAATACAG TTACTTAATG ACCATGTTAT ATTGCTTCCT GTGTAACATC TGCCATTTAT 2220 

TTCCTCAGCT GTACAAATCC TCTGTTTTCT CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTOTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTAJTGCT TGCTTTTTIC CAGATTCAGG GAGAATGTTG 2400 

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460 

CTGTCAAAAA TTTTGAATGT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGATAAAA 2520 
TAAAATTTTA TTTTAAATTT T 
SEQ ID NO:30 PAA7 PROTEIN SEQUENCE 

Protein Accession #: NP_110401 

1 11 21 31 41 51 

I I I I I I 

MSSCNFTHAT FVLIGIPGLE KAHFWVGFPL LSMYVVAMFG NCIVVFIVRT ERSLHAPMYL 60 
FLCMLAAIDL ALSTSTMPKI IALFWFDSRE ISFEACLTQM FFIHALSAIE STILLAMAFD 120 
RYVAICHPLR HAAVLNNTVT AQIGIVAVVR GSLFFFPLPL LIKRLAFCHS NVLSHSYCVH 180 
QDVMKLAYAD TLPNVVYGLT AILLVMGVDV MFISLSYFLI IRTVLQLPSK SERAKAFGTC 240 
VSHIGVVLAF YVPLIGLSVV HRFGNSLHPI VRVVMGDIYL LLPPVINPII YGAKTKQIRT 300 
RVLAMFKISC DKDLQAVGGK 

SEQ ID NO:31 PAV6 DNA SEQUENCE 

Nucleic Acid Accession #: XM_050837 

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGAACTGGG AGCTGCTGCT QTGOCTGCTO GTQCTQTOCQ COCTGCTCCT GCTCTTGGTO 60 

CAGCTGCTGC GCTTCCTGAG GGCTGAGGGC GACCTGACGC TACTATGGGC CGAGTGGCAG 120 

GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 180 

GGAATTGGTG AGGAGCTGGC TTACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 240 

GCCAGAAGAQ TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAQAA TGGCAATTTA 300 

AAAGAAAAAG ATATACTTGT TO OOCOC TT GACCTGAOCG ACACTGGTTC CCATGAAGCG 360 

GCTACCAAAO CTOTTCTCCA GGAGTTTGGT AGAATC6ACA TTCTGGTCAA CAATGGTGGA 420 

ATGTCCCAGC OTTCTCTGTG CATGGATACC AGCTTQGATG TCTACAGAAA GCTAATAGAG 480 

CTTAACTACT TAGGGACGGT GTCCTTGACA AAATGTGTTC TGCCTCACAT GATCGAGAGG 540 

AAGCAAGGAA AGATTGTTAC TGTGAATAGC ATCCTGGGTA TCATATCTGT ACCTCTTTOC 600 

ATTGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACAGAA 660 

CTTGCCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 720 

ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 780 

TOCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATGAT 840 

TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCAATAC 900 

ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGOA AGAAAAGGAT TGAGAACTTT 960 
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTPAAAATCT TTAAGACAAA ACATQA CTGA 

SEQ ID NO:32 PAV6 Protein sequence
Protein Accession #: XP_050837

1 11 21 31 41 51 

I I I I I I 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTGASS 60 

GTGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLBNGNL KEKDILVLPL DLTDTGSHEA 120

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLXE LNYLGTVSLT RCVLPHMZER 180 

KQGKZVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NTCPGFVQSN 240 

I VEKS LAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMASD LKEVWISEQP FLLVTYLWQY 300 
MPTWAWWITN KMOKKRJENF KSGVDADSSY FKIFKTKHD 

SEQ ID NO:33 PBA6 DNA SEQUENCE

Nucleic Acid Accession #: NM_006853

Coding sequence: 25-874 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC 60 

ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGOC CGCTCCTCCC CCCTCCAGGC 120 

CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 180 

CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG GCCTGGCAGG CAGCCCTGTT 240 

CGAGAAGACG CG6CTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 300 

AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 360 

GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 420 

CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT 480 

CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCT C CTCA CGCTGTGTCA CTGCTGGCAC 540 

CAGCTGOCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 600 

CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 660 

CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 720 

GGGTGACTGC GGGGGCCCTC TGGTCTQTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 780 

CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 840 

GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA OCCACCCACC ACAGCCCATC 900 

ACCCTCCATT TCCACTTGGT C TTTGGT TOC TGTTCACTCT GTTAATAAGA AACCCTAAGC 960 

CAAGACOCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 1020 

AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTCT 1080 

GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 1140 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG 

SEQ ID NO:34 PBA6 PROTEIN SEQUENCE

Protein Accession #: NP_006844

1 11 21 31 41 51

I I I I I I 

HRILQLIUA IATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LXAFRWLLTA 60 

AHCLKPRyiV HLGQHNLQKE EGCEQTRTAT ESFPHPGPNN SLPNKDHRND IMLVKMASPV 120 

SITWAVRPLT LSSRCVTAGT SCLISGWGST 5SPQLRLPHT LRCANXTIIE HQKCENAYFC 180 

NITDTKVCAS VQEGGKDSCQ GDSGGPLVCK QSLQGIISWG QDPCAITRKP GVYTKVCKW 240 
DWIQETMKNN 

SEQ ID NO:35 PBC1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001775

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CTAAAGCTCT CTTGCTQCCT AGCCTCCTGC CGGCCTCATC TTCGCCCAGC CAACCCCGOC 60 

TGGAGCCC TA TGG CCAACTG CGAGTTCAGC CCG G TGTCOG GGGACAAACC CTGCTQCCGG 120 

CTCTCTAGGA GAGOCCAACT CTGTCTTGGC GTCAGTATCC TGGTOCTGAT OCTCGTCGTQ 180 

GTGCTOGOGG TGGTCGTOCC GAGGTGGCGC CAGACGTGGA GCGGTCCGGG CACCA0CAA6 240 

CGCTTTCCCG AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATQ 300 

AGACATGTAG ACTGCCAAAG TGTATCGGAT 6CTTTCAA66 GTGCATTTAT TTCAAAACAT 360 

CCTTGCAACA TTACTGAAGA AGACTATCAG CCACTAATGA AGTTGGGAAC TCAGACCGTA 420 

CCTTGCAACA AGATTCTTCT TTGGAGCAGA ATAAAAGATC TG6CCCATCA GTTCACACAG 480 

GTCCAGCGGQ ACATGTTCAC CCTGGAGGAC ACGCTGCTAG 6CTACCTTGC TGATGACCTC 540 

ACATGGTGTG GTGAATTCAA CACTTCCAAA ATAAACTATC AATCTTGCCC AGACTGGAGA 600 

AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTGGAAAA CGCTTTCCCG CAGGTTTGCA 660 

GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720 

AAAAACAGCA CTTTTGGGAG TGTGGAAGTC CATAATTTGC AACCAGAGAA GGTTCAGACA 780 

CTAGAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG OCAGGATCCC 840 

ACCATAAAAO AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900 

ATCTACAGAC CTGACAAGTT TCTTCAGTGT GTQAAAAATC CTGAGGATTC ATCTTGCACA 960 

TCTQAGAT CT GAG CCAQTCQ CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020 

CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGGTCCTCCA 1080 

CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140 

CATCAQCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200 
AATGAAAATT GTATGTTAAG TTACTTCCTT TAG 

SEQ ID NO:36 PBC1 Protein sequence
Protein Accession #: NP_001768

1 11 21 31 41 51 

I I I I I I 

MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILVWLA WVPRWRQTW SGPGTTKRFP 60 

ETVLARCVKY TEIHPEMRHV DCQSVWDAFK GAPISKHPCM ITEEDYQPLM KLGTCTVPCN 120 

KILLWSRIKD LAHQFTCVQR DKFTLEDTLL GYLADDLTWC GEFNTSKINY QSCPDWRKDC 180 

SNNPVSVFWK TVSRRFABAA CDWHVMLNG SRSKIFDXNS TFGSVEVHNL QPEKVQTLEA 240 
NVXBSGRBDS RDLCQDPTTK ELBSIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSBX 

SEQ ID NO:37 PBH1 DNA SEQUENCE

Nucleic Acid Accession #: XM_017718

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGTCCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60 

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120 

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACGAGAAT GTGTC T TC TT TACCAAAGAT 180 

TCCAAGGCCA CGGAGAATGT GTGCAAGTGT GGCTATGCCC AGAGCCAGCA CATGGAAGGC 240 

ACCCAGATCA ACCAAAGTGA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300 

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360 

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTG GCACCTGAAA 420 

ACACCCAACC TGGTCATTTC TGTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480 

ATGCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540 

ACGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600 

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGGTC 660 

TCCAACOGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720 

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780 

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840 

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900 

ATTGTGTGTT TTGCCCAAGG AGGTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960 

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020 

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080 

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140 

CTCAAAGAAA TTCTOGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200 

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260 

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320 

TTAGCCAATG ATGAGATTTT CACCAATGAC CGCCGATGGG AGTCTGCTGA CCTTCAAGAA 1380 

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440 

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500 

TTCAGCACGC TTGTGTACCG GAATCTGCAG ATCGCCAAGA ATTCCTATAA TGATGCCCTC 1560

CTCACOTTTG TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTOCGGAA GGAAGACAGA 1620

AATGGCCGGQ ACGAGATGGA CATAQAACTC CACGACGTGT CTCCTATTAC TCOGCACCCC 1680 

CTGCAAGCTC TCTTCAICTG GGCCATTCTT CAGAATAAGA AGGAACTCTC CAAAGTCATT 1740 

TGGGAGCAGA CCAGGG6CT6 CACTCTGGCA GCCCTGGGAG CCAGCAACCT TCTGAAGACT 1800 

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTGGGGAGT CCGAGGAGCT GGCTAATGAG 1860

TACGAGACCC GGGCTGTTGA GCTGTTCACT GAGTGTTACA GCAGCGATQA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTOTGAAOCT TGGGGTGGAA GCAACTOTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAGCCTGGGG TCCAGAATTT TCTTTCTAAG 2040 

CAATGGTATG GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT GTGTCTOTTT 2100 

ATTATACCCT TGQT6GQCTQ TGGCTTTGTA TCATTTAGGA AGAAACCTOT CGACAA6CAC 2160 

AAGAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC CCTTCGTGCT CTTCTC CT OO 2220 

AATGTGGTCT TCTACATCGC CriCCTCC l tS CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTCGGTGC CACACCCCCC OGA6CTGGTC CTGTACTCGC TGGTCTITCT OC T C TT C T S T 2340 

GATGAAGTGA QACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGQAATQTG 2400 

ATGGACACGC TQGGGCTTTT TTACTTCATA GCAGGAATTG TATTTOGGCT CCACTCTTCT 2460 

AATAAAAGCT CTTTGTATTC TGGAOGAGTC AOTTTCTCTC TGGACTACAT TATTTTCACT 2520 

CTAAGATTGA TCCACAXTTT TACTGTAAGC AGAAACTTAG GACOCAAGAT TATAATGCTG 2580 

CAGAGGATGC TGATCGATGT tfl ' TCri C TTL ' CTGTTCCTCT TTGCGGTOTQ GATGGTGGCC 2640 

TTTGCCGTGG CCAGGCAAGG GATCCTTAGQ CAGAATGAGC AGC6CTGGAG GTGGATATTC 2700 

CGTTCGGTCA TCTACGAGCC CTAOCTGGCC ATGTTCGGCC AGGTGCCCAG TGACGTGGAT 2760 

GGTACCACGT ATGACTTTGC CCACTGCACC TTCACTGGGA ATGAGTCCAA GCCACTGTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCOCCGO TTCOCCGAGT GOATCACCAT CCCCCTGGTG 2880 

TGCATCEACA TGTTATCCAC CAACATCCTO CTGGTCAACC TGCTGGTCGC CATGTTTGGC 2940 

TACACGGTGG GCACCGTCCA GGAGAACAAT GACCAGGTCT GGAAGTTCCA GAGGTACTTC 3000 

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT ATCCCCTTCC CCTTCATCGT CTTCGCTTAC 3060 

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTGCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCT GC V GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGGAGGG TGTCATGAAG 3180 

GAAAACTACC TTGTCAAGAT CAACACAAAA GCCAACGACA CCTCAGAGGA AATGAGGCAT 3240 

CGATTTAGAC AACTGGATAC AAAGCTTAAT GATCTCAAGG GTCTTCTGAA AGAGATTGCT 3300 
AATAAAATCA AATGA 

SEQ ID NO:38 PBH1 Protein sequence
Protein Accession #: XP_017718

1 11 21 31 41 51 

I I I I I I

MSFRAARLSH RNRRNDTLDS TRTLYSSASR STDLSYSESD LVNFIQANPK KRECVPFTKD 60 

SKATENVCKC GYAQSQHMEG TQINQSEKWN YKKHTKEFPT DAFGDIQPET LGKKGKY1RL 120 

SCDTDABILY ELLTQHHHLK TPNLVXSVTG GAKNFALKPR MRKIPSRLIY IAQSKGAWH. 180 

TGGTHYGLMK YIGEWKDNT ISRSSEENIV AZGIAAWGKV SNHDTLXHNC DAEGYFLAQY 240 

LMDDFTRDPL YILDNNHTHL LLVCNGCHGH PTVEAKLRNQ LEKYISERTI QDSNYGGKIP 300 

IVCFAQGGGK ETLKAINTSI KNKIPCVWB GSGQIADVIA SLVEVEDALT SSAVKEKLVR 360 

FLPRTVSRLP EBBXESNZKN LKBILECSHL LTVIKMEEAG DEIVSNAISY ALYKAFSTSE 420 

QDKDNWNGQL KLLLEWNQLD LANDEIFTND RKWESADLQE VMFTALIKDR PKPVRLFLEN 480 

GLNLRKFLTH DVLTELFSNH FSTLVYRNLQ IAKNSYHDAL LTFVWKLVAN FRRGFRKEDR 540 

NGRDEMDIEL HDVSPITRHP LQALFIWAXL QNKKELSKVI WEQTRGCTLA ALGASKLLKT 600 

LAKVKNDXNA AGESEELANE YETRAVELFT ECYSSDEDLA EQLLVYSCEA WGGSNCLELA 660 

VEATDQHPIA QPGVQHFLSK QWYGEISRDT KNWKIILCLP XIPLVGCGPV SFRKKPVDKH 720 

KKLLWYYVAF FTSPFWFSW NWFYXAFLL LFAYVLLMDF HSVFHPPELV LYSLVFVLFC 780 

DEVRQWYVNG VNYFTDLWNV MDTLGLFYFI AGIVFRLHSS NKSSLYSGRV IPCLDYIIFT 840 

LRLIHIFTVS ENLGPKJIKL QRMLIDVFFF LPLFAVWKVA FGVARQGILR QNEQRKRWIF 900 

BSVIYEPYLA KPGQVPSDVD GTTYDFAHCT PTGNESKPLC VELDEHNLPR FPEWITIPLV 960 

CXYMLSTOIL LVNLLVAMFG YTVGTVQENN DQVWKFQRYF LVQEYCSM2I IPFPFIVFAY 1020 

FYKWKKCFK CCCKEKNHES SVCCPKKEDM ETLAWEGVHK ENYLVKINTK ANDTSEEMRH 1080 
RFRQLDTKLN DLKGLLKEIA NK1K 



SEQ ID NO:39 PBH3 DNA SEQUENCE 

Nucleic Acid Accession #: XM.0U804

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGCCTCGOC TOTTCTTSTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60 

AGAGCAGTOG CGGCCAAATG GAAGGACGAT GTTATTAAAT TATGCGGCCG CGAATTAGTT 120 

CGCGCGCAGA TTGOCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180 

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGCCACCGGA GCTGAAGGCA 300 

GCCCTATCTG AGAGGCAACC ATCATTACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAAXAGGGA AAGTGAAGCC 420 

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGG CTTGG ATACTCATTC TCAAAAAAAG 480 

AGACGACCCT AOGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540 
CTTGCTAAAT ATTGCTGA 

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE

Protein Accession #: NP_008842

1 11 21 31 41 51 

HPRLPLPHLL EFCLLLNQFS RAVAAKKKDD VTKLCGRELV RAQIAICGMS TWSKRSLSQE 60

DAPQTPRPVA EIVPSFINKD TBTZIZMLBF IANLPPELKA ALSERQPSLP ELQQYVPALK 120

DSNLSFEEFK KLIKNRQSEA ADSNPSELKY LGLDTHSQKK RRFWALFEK CCLIGCTKRS 180 
LAKYC 

SEQ ID NO:41 PBH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_005845

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGACGCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTCGTGGCT CAATOCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120 

GAAQATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180 

CAAGGGTTCT GGGATAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTTCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300 

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTCG GAAAAATTAT TAATTATTTT 360 

GAA AATTATG ATCCCATGGA TTCTGTGGCT TTCAACACAO CQT ACGCCTA TGCCACGGTO 420 

CTGACTTTTT GCACGCTCAT TTTGGCTAIA CTGCATCACT TATATTTTTA TCACQTTCAG 480 

TCTGCTGGGA TGAGGTTACG AGTAGCCATG TGCCATATGA TTTATCGGAA GGCACTTCGT 540 

CTTAGTAACA TGGCCATGGG GAAGACAACC ACAGGCCAGA TAGTCAATCT GCTGTCCAAT 600 

QATGTGAACA AGTTTGATCA GGTGACAGTG TTCTTACACT TCC T O T GQGC AGGACCACTQ 660 

CAGGCGATCG CAOTGACTGC CCTACTCTGG ATGGAGATAG GAATATCGTG CCTTOCT O GQ 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780 

CTGAGGAGTA AAACTGCAAC TTTCACGGAT GCCAGGATCA GGACCATGAA TGAAGTTATA 840 

ACTGGTATAA GGATAATAAA AATGTACGCC TGGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTGAGAAGTT CCTGCCTCAG GGGGATGAAT 960 

TTGGCTTCGT TTTTCAGTGC AAGCAAAATC ATCGTGTTTG TGAOCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGCCAGCCGC GTGTTCGTGG CAGTGACGCT GfEATGGGGCT 1080 

GTGCGGCTGA CGGTTACCCT CTTCTTCCCC TCAGCCATTG AOAGGGTGTC AGAGGCAATC 1140 

GTCAGCATCC GAAGAATCCA GACCTTTTTG CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260 

AAGGCATCAG AGACCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGGCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGCCTA TGTQTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGT G CTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620 

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680 

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740 

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800 

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGT6CAGA AGGGGACTTA CACTGAGTTC 1860 

CTAAAATCTO GTATAGATTT TGGCTCCCTT TTAAAGAAGG AXAATGAGGA AAGTGAACAA 1920 

OCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TATAAGAATT ACTTCAGAGC TGGTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATGTAAOCGA GAAGCTAGAT 2280 

CTTAACTGGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTACCGTTCT TTTTGGCATA 2340 

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGGA CACTTGGATG ATTTGCTGCC GCTGACOTTT 2520 

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TGTGGCCGTG 2580 

AT TCCTTGG A TCGCAATACC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCG GAGTCCAGTG 2700 

TTTTCCCACT TGTC ATCTTC TCTCCAGGGG CTCTGGACCA TOCGGGCATA CAAAGCAGAA 2760

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820 

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC OGTCTGGATG CCATCTGTGC CATGTTTGTC 2880

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTAXGCCCT CACGCTCATG GGGATGTTTC AGTGGTGTGT TOGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC CCACCACCAG CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180 

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240 

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300 

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

CCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AACCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAASCAG GATCCAATTT TAGTGTTGGA 3540 

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600 

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCCGG 3660 

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGGGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGGCAGAA 3840 

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG CACTGTGA 



SEQ ID NO:42 PBH5 PROTEIN SEQUENCE

Protein Accession #: NP_005836

1 11 21 31 41 51

I I I I I I

MLPVYQEVKP NPLQDANLCS KVFFWWLNFL FKIGHKRRLE BDDHYSVLPE DRSQHLGSBL 60 

QGFWDKEVLR AEMDAQKPSL TKAIIKCYWK SYLVLQIFTL IEESAKVIQP XFLGKZZNYF 120 

ENYDPMDSVA LNTAYAYATV LTFCTLILAI LHHLYPYHVQ CAQMRLRVAM CHMXYRKALR 180 

LSNKAMGXTT TGQIVNLLSN DVNKFDQVTV FLHFLWAGPL QAIAVTALLW MBIGISCLAG 240 

MAVLIILLPL QSCFGKLFSS LRSKTATPTD ARIRTMNEVI TGIRIIKHYA WEKSFSNLIT 300 

NLRKKEISKI LRSSCLRGMN LASFFSASKI JVFVTFTTYV LLGSVITASR VFVAVTLYGA 360 

VRLTVTLPFP SAIERVSEAI VSIRRIQTFL LLDEISQHNR QLPSDGKXMV HVQDFTAFWD 420 

KASETPTLQG LSFTVRP6EL LAWGPVGAG KSSLLSAVLG ELAPSHGLVS VHGRIAYVSQ 480 

QPWVFSGTLR SNILFGKKYB KERYEKVTKA CALKKDLQLL KDGDLTVIGD RGTTLS6GQK 540 

ARVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFELCICQ ILHEKITILV THQLQYLKAA 600 

SQILILKDGK MVQKGTYTEF LKSGZDFGSL LKKDNEESEQ PPVPGTPTLR NRTFSKSSVW 660 

SQQSSRPSLK DGALESQDTE NVPVTLSEEN RSBGKVGFQA YKNYFRAGAH WIVFIFLILL 720 

NTAAQVAWL QDWWLSVWAN KQSMLNVTVN GGOIVTEKLD LNWYLGTYSG LTVATVLFGI 780 

ARSLLVFYVL VNSSQTLHNK MFESILKAPV LFFDRNPIGR ILNRFSKMG HLDDLLFLTF 840 

LDFI0TLLQ7 VGWSVAVAV IPWIAIPLVP LOIIFIFLRR YFLETSRDVK RLESTTRSPV 900 

FSELSSSLQG LWTIRAYKAE ERCQELFDAH QDLHSEAWFL FLTTSKWFAV RLDAICAMFV 960 

IIVAFGSLIL AKTLDAGQVG LALSYALTLM GMFQWCVRQS AEVENMMISV ERVXEYTDLB 1020 

KEAPWEYQKR PPPAWPHEGV ItFDWVNFMY SPGOPLVLKH LTALIKSQEK VGIVGRTGAG 1080 

KSSLISALFR LSEPEGKIWI DKILTTEIGL HDLRKKMSII PQEPVLFTGT KRKNLDPFNE 1140 

HTDBELWNAL QEVQLKETIE DLPGKMDTEL ABSGSNFSVG GRQLVCLARA ILRKNQILII 1200 

DEATANVDPR TDKLIQKKIR BKFAHCTVLT IAHRLNTIID SDKIMVLDSG RLKEYDEPYV 1260 

LLQNKBSLFY KMVQQLGKAE AAALTETAKQ VYFKRNYPHI GHTDHMVTNT SNGQPSTLTI 1320 
FETAL 

SEQ ID NO:43 PBQ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_021233

Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGGAAAG TGTCCTGCTG TGGCATGAAA TAAATGAAAC AGAAAATGAT GGCAAGACTG 60 

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120 

ATTTCATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTAOCT AGACTCTACA 240 

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA CCAAGAGTGT TTTGGGAAGG 300 

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TGGCTGATTC ATTCCATCCC TCAGTTTCCT 480 

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GAOGAAATGG ACAAAGTGGC 540 

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAATAG ATTCTCAGCT CTTGGTCTGC 600 

AACCCCAACG TCTATAGCTG CTCCATCOCA GCCACCTTTC ACCAGGAGCT CATTCACATQ 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT CGGATTCTTT TCTTGACGAC 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTO AAGACACACT TGTTAACAGA AACCTGGCAG 840 

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTT C C T T ACCATGTCTA CAATATAAAA 900 

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATTCATT TGTACCCAGA ATTGGCAAAT TTACCAAGGA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAAGTAAA CTTGGTGAAA GGACACAGGT

SEQ ID NO:44 PBQ7 PROTEIN SEQUENCE

Protein Accession #: NP_067056

1 11 21 31 41 51

I I I I I I

MMARLLRTSF ALLFLGLFGV LGAATISCRN EEGKAVDWFT FYKLPKRQNK ESGBTGLEYL 60 

YLDSTTR5WR KSEQLMNDTK SVLGRTLQQL YEAYASKSNN TAYLIYNDGV PKFVNYSRKY 120 

GHTKGLLLWN KVQGFWLIHS IPQFPPIPEE GYDYPPTGRR KGQSGICITF KYNQYEAIDS 180 

QLLVCNPNVY SCSIPATFHQ ELZBMPQLCT RASSSEIPGR LLTTLQSAQG QRFLHFAKSD 240 

SPLDDIFAAW HAQRLKIHLL TETWQRKRQE LPSNCSLPYH VYNIKAXKLS BHSYFSSYQD 300 
HAKWOSQKG TKNRWTCZGD LNRSPHQAFR SGGFICTQNW QIYQAPQGLV LYYESCK 



SEQ ID NO:45 PCQ8 DNA SEQUENCE

Nucleic Acid Accession #: XM_030453

Coding sequence: 89-1273 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

CGGTGCCCTG GGGTGGAATA TOCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60 

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120 

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTGAOG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGGTTTG GTTCAGTGAA AATATGTTTG GACCAGATTT 240 

CAOTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT lUXaCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360

CATCACCTAC CAGAGCAAGC TGGCCAAGGA CGT6CT0QAC ACCATCCTAG GCATCCAACC 420

CAAGGACACC TCTGGTGGAG GGGATGAGAC CCG6GA66C6 CT GG T GG CCC GGCTGGCTGA 480 

TGATATGCTG GAGAAGCTGC CCCCAGACTA TGTCCCCTTT GAAGTAAAAO AGAGGCTGCA 540 

GAAGATGGGG CCATTCCAGC CEATGAACAT TTTCCTCAGG CAGGAAATAG ACAGAATGCA 600

AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AA ACTTG CTA TTGATGGCAC 660 

CATCATCATO AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 720 

TGCTTGGTGG AAAAAAGCTT CTTGGGTTTT TAGTACACTG GGTTTCTGGT TTACTGAACT 780 

TATAGAAAGA AACAGCCAGT TTACCTCGTG GGTTTTCAAT GGCOGACCTC ACTGCTTTTG 840 

GATGACGGGT TTTTTTAACC CCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 900 

GGCCAACAAA GGCTGGGCTC TGGACAATAT GGTGCTTTGC AATGAAGTCA CCAAATGGAT 960

GAAGGACGAC ATTTCTACCC CTCCCACAGA GGGTGTCTAT GTCTATGGCT TATATCTTGA 1020 

AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATTGAA TCAAAGCCAA AAGTGCTCTT 1080 

TGAGTTGATG CCTGTCATAA GGATTTATGC AGAAAACAAT ACTTTACGAG ATCCT CG GTT 1140 

TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TT6CCGCTGT 1200 

GGATCTCAGG ACAGCCCAGA CCCCTGAACA CTGGGTGCTC CGTGGGGTTG CCCTTCTGTG 1260 

TGATGTCAAG TAACATGTGG GGAGTGTCCC CACCCAATGC TTTGGAAAAT GCAAGATCTA 1320 

AATTATTGTA ACCTTTATTT CTGTATOACT GCTGGACAGT GTATGTTAGG TCGTTTATGC 1380 

AATTAATOAG CTGCATAGGT TTTCOOCACT CCTTAATTGG ATGCTTATAT TTTACTTGTT 1440 

TCATCATTAG TGACCAATGT CTGAGTTTGT TGAAAATGTT ATTTAGTGAT ATAAAAGTAA 1500 

ATTTACAGCA TCCTAATGAA GTGTGGCCCT CAAATCCACA GTAGTATATT TTCTTCTTAC 1560 

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAATATA TTTGCATGTG GACAAAGATT 1620 

AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGATAGCAA GAATTATAGT TGGCTTGAAA 1680 

AAATGTGATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA ATATTAGACG GTGCGTAGGG 1740 

ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT 1800 

CT1TAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAATA 1860 

TAGTCAGTAC TAAATTAGAA TTCTGGTTTA TAAACTTTTG GTTAGCTCTG GATCTGTATA 1920 

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA CCGGGAGACA AGTGTGGGTC 1980 

CCTCTCACTG GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA 2040 

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG 2100 

TTCTTTCTTT TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT 2160 

CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT 2220 
TTTACTAAAA AAAAAAAAAA AAA 

SEQ ID NO:46 PCQ8 Protein sequence
Protein Accession #: BAB15543 



1 11 21 31 41 51 

I I I I I I 

MDVKKGVSWT TIRYMIGEIQ YGGKVTDDYD KRLLNTPAKV WFSENHFGFD FSFYQGYNIP 60 

KCSTVDNYLQ YIQSLPATOS PBVFGLHPNA DITYQSKLAK DVLDTILGIQ PKDTSGGGDE 120 

TREAWARLA DDMLEKLPPD YVPFEVKERL QKMGPFQFKN IFLRQBIDRM QRVLSLVRST 180 

LTELKLAXBG TIIMSENLQD ALDCMFDARI PAWWKKASWV FSTLGFWFTE LIERNSQFTS 240 

WVFNGRPHCF WHTGFFNPQG FLTAMRQEIT RAKKGWAU3N MVLCNEVTKW MKDDISTPPT 300 

EGVYVYGLYL EGAGWDKRNM KLIESKFKVL FELMFVTRIY AENNTLRDPR FYSCPIYKKP 360 
VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK 

SEQ ID NO:47 TOGS DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 63-3349 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60 

AGATGACATQ GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120 

ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT TCTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240 

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300 

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360 

GGCCCAATCC AAAATGGAGT CAGCOCAGGA TGTTCAAACT ATCTGCAAAG AAAAGCCTTC 420 

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480 

AGGAGATGTT TATGCCAAGA CTCTGCCTCC CAGAAGCCTT TTTCAGTCCT CAAGGAAGCC 540 

TGATGCTGAA GAAGTCTCCT CAGA3TCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600 

AGAACTGGCT CATGGTCACT CTTOCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660 

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720 

CAGATGCCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780 

CAGTTATGTT GAAAAOTACA ACACTTCTQA TQATTGCAGC AGCTCAGAGG AAGACCTGCC 840 

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900 

TTCAAATAAT ACTCCTGAAG AGCAGAATGA TTTTATGCAQ CAGCTGCCTT CCAGATGCCC 960 

TTCTCAGCCC ATTATGAATC CTACTG TTCA GCAACAAGTC COCACCAGTT CAGTGGGCAC 1020 

TTCTATAAAA CAGAGCGATT COGTGGAGCC AATCCCTCCA AGACACOCTT TCCAGCCATG 1080 

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140 

GAGCATTTCT ATGAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200 

AGTTCAACAA AACATOTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260 

GGAGCCACTA CTCCCCAGAT ATTCTCCTCA GTCCTTGACA GATCCTCAAA TCCGGCAAAT 1320 

CTCAGA AAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380 

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAOAATGGAG 1440 

CAGTCCTGTG GCACCAACAC CTTCCAAATA CACTTCCCCG CCATGGGTGA CCCCTAAATT 1500 

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560 

TAAGGAGCAG CTGCTTOOCA GACATCTTTC OCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620 

ACTGTCCTCA AATTTCGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA GTCCATTGCC 1680

TOCCCAATAT GCTACCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAOAAAATO 6CTGTTGAAG GCACTTCTAA CAAATCACCO A3TCCCAGGC GTCCGACCCA 1800 

GTCATTCQTO AAATTTATGG CACA6CAAAT CTTTTCAGAO AGCTCT GC TC TTAAGAGGGQ 1860 

CAGTGATGTG GCACCTCTGC CTCCCAATCT TCCTTCCAAA TCTTTATCAA AGCCTGAAGT 1920 

CAAGCACCAA GTTTTCTCAG ATTCA GGQAO TGCT AATOCT AAGGGAGGCA TTTCTTCAAA 1980 

GATGCTACCT ATGAAGCACC CTTTACAOTC CTT6GQGAG0 CCTGAAQAOC CACAGAAAGT 2040 

TTTCTCTTAT TCAGAGAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAQAOC AgCTGTCTCC 2100 

CAGGCAGCTT TCCCAGGCCT TQAGGAAACC .TGAGTATOAG CAAAAAGTCT CCOCTQTTTC 2160 

TGCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAQ CTGCCTCCCA AACATTCTTC 2220 

CCAAGCCTCA 6ATAG0TCTA AATTCCAGCC ACAGATGTCA TCAAAGGGCC CAOTGAATGT 2280 

ACCTGTAAAG CAGAGCAGCG GTGAGAAGCA CCTGOCTTCA AGTAGTCCTT TCCAGCAACA 2340 

GGTTCATTCA AGTTCTGTGA ATGCTOCTGC TAGGCGATCT GITTTTGAGA 6CAATTCT6A 2400 

CAATTGGTTC CTAGGAAGAG ATGAA6CTTT TCCAATCAAA ACCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC COCATAAAGA GCATTOCA6C CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTQTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTGA 2580 

GAGTGGTGAT GGTAATAATA AOCAGCATGC AAACCTATCC AATCAGGATG ATOTTGAAAA 2640 

GCTTTTTGGA GTTCGACTGA AAAGAGCCOC TCCCTCGCAG AAGTATAAGA GTGAGAAACA 2700 

AGATAACTTC ACCCAGCTTG CTTCAGTGCC CTCGGGOCCA ATTTCATOCT CTGTAGGCAG 2760 

GGGACATAAA ATCAGAAGCA CTTCCCAGGG GCTCCTGGAT GCTGCAGGGA ACCTCACCAA 2820 

AATATCTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880 

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGCCGGTT TGGATAACTA TGGCAAAGCA GAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC TGGAGCCGAT GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCTG CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

CCAXAAACAG GAGAAGACAG CACAGATGAA GCCAOCTAAG CCKACAAAAT CAGTTGGATT 3180 

TGAAGCTCAG AAGATACTGC AAGTTOCTGC CATGGAAAAA GAAACCAAAC GATCTTCAAC 3240 

TCTCCCAGCC AAGTTCCAGA ACCGAGTTGA GCCAATTGAG OCT G TCTGGT TCTCACTGGC 3300 

CAGGAAGAAA GCCAAAGCAT GGAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360 

GTCGAGCATC AGCA1TXATT TTATTTAGTT TXTTTTTTTT TTTTTTTTTT GAGACAGAGT 3420 

CTCGCTCTGT TACCCAGATT GGAGTGGAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480 

CTCCGGGGTT CACGCCACTC TOCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540 

CGCCATCACG COCGGCTAAT TTTGTTTTCG TATTTTTAGT AGAGACGGGG TTTCACCATG 3600 

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAG 3660 

CTGGGATTAC AGGOGTGAGC CACCGCGCCC GGCCAAGCAT CAGOGTTTTA AATGATAATT 3720 

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780 

AAATAAOTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840 

AGCCTTAOAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGOC ATGAATGCTT 3900 

AACCTTCCAC GTAGTCACTG OCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960 

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020 

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACTAGAAA ATTGTCATTT C CCTCA ACCA 4080 

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140 

TTATGAGOCT AGATAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGGATC 4200 

CZAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC OCTTGAAACA TCTTCCCTGT GTOOCTGGGG GCCOCTGATG 4320 

C CTT C TCCTT GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380 

ACCCCACGTC GACCTGTCAC AGGCCTGGCT GTAGCGAGCA CCTCCCTATG ACGCAGAATG 4440 

CTTCTTGGGA ATTATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GCTICTTGTC 4500 

CCAATACTAC TGTGACOCTC TCTGATCGCA CAGAAATCAC TGOCTATCAC ATATATCCTG 4560 

TZAAGCACTG AAGAOCCTAT TGAAATTAGA GTTCTACAGA TGOCAAAAGC TGTACTTTCC 4620 

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGGCACTGG AGCTCCTTCC 4680 

TCTCTGGTTC CAGCATTAAG GTGGAGAACT CCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TGTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACAGAGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAG TGAGTAAATC AGC TOTTOO C CTCTCATTCC 4860 

TAGAGGCACA CACCTCAAAA GTTACTAGGC T6GAGAGACC CTACCTTCCA GTGACCCACT 4920 

CATCCCCCAG OCACGGAGAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980 

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCOCAGC AAATGTTGAT GCXrrCCCTTC 5040 

CCATCTTGOC ACAOGGTCTT TTTCTTTTGT AGCACAGCCT CCATTAAIAA CTC CTCQQC T 5100 

GAGGATGAAG ATGTAGGCAC CTTTAOCCCC AGAGCGAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA OCACOCTAGA ATCTCATCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCICTU S K M TCTCTTCATT CCATCCCCCA AACCCACCAA ACACTAAGGG AGAGCTCCCT 5280 

TTGGATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACOCAGAAG TGACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTTACA CTGTGCTGCT TTGCTTAAAC AGAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGG? GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AGTGAGGTGT CAGAGCTAAA CACTTGGTGC TGGGTTTTGT TGATGCTGGT 5520 

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580 

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT GAATTTTATT 5640 

TCCTTTGTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTTATC TGGGTACTCC 5700 

TGGTAGATTA GCT GTTACAC CTCOCTICCC TTTTTTA CAQ TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTOCA ATAACAATTT CTTTTOCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCTTC CTTTTTTTCA CAGTGAACCT GTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACAGTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940 

AGTTCTTTTT AAATGCCTAA AGGCATATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

KFO m NeHA pnGS Protein spam** 

Protein Accession #: BAA86524

1 11 21 31 41 51 

I I I I I I 

EQPTTSQPET TTPQGLLSDK DDHGRRKAGZ DFGSHKASAA QPIPENMDNS MVSDPQPYHE 60 
DAASGAEKTE ARASLSLMVE SLSTTQBEAI LSVAAEAQVF MJPSHIQLED QEAFSFDLQK 120

AQSKMESAQD VQTICXEKPS GNVHQTFTAS VLGMTSTTAK GDVYAKTLPP RSLFQSSRKP 180

DAEEVSSDSE NIPEEGDGSB ELAHGHSSQS LGKFEDEQEV FSESKSFVED LSSSEEELDL 240

RCLSQALBEP EDAEVFTESS SYVEKYWTSD DCSSSEEDLP LRHPAQALGK PKNQQEVSSA 300

SNNTFRBQND FHQQLPSRCP SQPDJNPTVQ QQVPTSSVGT SIKQSDSVBP IPPRHPFQPW 360

VNPKVEQEVS SSPKSMAVES SISHKPLPPK LLCQPLMNPK VQQNMF5GSE DIAVERVTSV 420

EPLLPRYSPQ SLTDPQIRQI SESTAVEEGT YVBPLPPRCL SQPSERPKPL DSMSTSABWS 480

SPVAPTPSKY TSPPWVTPKF EELYQLSAHP ESTTVEEDIS KEQLLPRHLS QLTVGNKVQQ 540

LSSNFERAAI BADIS6SPLP PQYATQFLKR SKVQEMTSRL EKMAVEGTSN KSPIPRRPTQ 600

SFVKFHAQQI FSESSALKRG SUVAPLPFNL PSKSLSKPEV KHQVFSDSG5 ANPKGGISSK 660

MLPMKHPLQS L6RPEDPQKV FSV5ERAP6K CSSFKEQLSP RQLSOALRKP EYEQKVSPVS 720

ASSFKEWHNS KKQLPPKHSS QASDRSKPQP QMSSKGPVNV PVKQSSGEKH LPSSSPFQQQ 780

VHSSSVNAAA RRSVFESNSD NWFLGRDEAF AXKTXXFSQG SKNPIKSIPA PATKPGKFTI 840

APVRQTSTSO GIYSKKEDLB SGDGNNNQHA NLSNQDDVEK LFGVRLKRAP PSQKYKSEKQ 900

ONFTQLASVP SGPISSSVGH GHKXRSTSGG LLDAAGNLTK ISYVADKQQS RPKSESMAKK 960

QPACXTPGKP AGQQSDYAVS EPVWITMAKQ KQKSFKAHIS VKELKTKSNA GABAETKEFK 1020

YEGAGSANEN QPKKHFTSSV            PPKPTKSVGF EAQKILQVPA MEKETKRSST 1080

LPAKFQNPVE PIEPVWPSLA RKKAKAWSHM AEITQ









SEQ ID NO:49 PAB7 DNA SEQUENCE

Nucleic Acid Accession #: D87742

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GCTTTCCTTT CTAAAGTAGA AGAGGATGAT TATCCCTCTG AAGAACTACT AGAGGATGAA 60 

AACGCTATAA ATGCAAAACG GTCTAAAGAA AAAAACCCTG GGAATCAGGG CAGGCAGTTT 120 

GATGTTAATC TGCAAGTCCC TGACAGAGCA GTTTTAGGGA CCATTCATCC AGATCCAGAA 180 

ATTGAAGAAA GCAAGCAAGA AACTAGTATG ATTTTGGATA GTOAAAAAAC AAGTGAGACT 240 

GCTGCCAAAG GGGTCAACAC AGGAGGCAGG GAACCAAATA CAATGGTGGA AAAAGAACGC 300 

CCTCTGGCAG ATAAGAAAGC ACAGAGACCA TTTGAACGAA GTGACTTTTC TGACAGCATA 360 

AAAATTCAGA CTCCAGAATT AGGTGAAGTG TTTCAGAATA AAGATTCTGA TTATCTGAAG 420 

AACGACAACC CTGAGGAACA TCTGAAGACC TCAGGGCTTG CAGGGGAGCC TGAGGGAGAA 480 

CTCTCAAAAG AGGACCATGG GAACACAGAG AAGTACATGG GCACAGAAAG GCAGGGGTCT 540 

GCTGCTGCAO AACCTGAAGA TGACTCQTTC CACTGGACTC CACATACAAG TGTAGAOCCA 600 

GGGCATAGTG ACAAGAGGGA GGACTTACTT ATCATAAGCA GCTTCTTTAA AGAACAACAG 660 

TCTTTGCAGC GGTTCCAGAA GTACTTTAAT GTCCATGAGC TGGAAGCCTT GCTACAAGAA 720 

ATGTCATCAA AACTGAAGTC AGCGCAGCAG GAGAGCCTCC CCTATAATAT GGAAAAAGTC 780 

CTAGATAAGG TCTTCCGTGC TTCTGAGTCA CAAATTCTGA GCATAGCAGA AAAAATGCTT 840 

GATACTCGTG TGGCTGAAAA TAGAGATCTG GGAATGAACG AAAATAACAT ATTTGAAGAG 900 

GCTGCAGTGC TTGATGACAT TCAAGACCTC ATCTATTTTG TCAGGTACAA GCACTCCACA 960 

GCAGAGGAGA CAGCCACACT GGTGATGGCA CCACCTCTAG AGGAAGGCTT GGGTGGAGCA 1020 

ATGGAAGAGA TGCAACCACT GCATGAAGAT AATTTCTCAC GAGAGAAGAC AGCAGAACTT 1080 

AATQTGCAGG TTCCTGAAGA ACCCACCCAC TTGGACCAAC GTOTGATTGO QGACACTCAT 1140 

GCCTCAGAAG TCTCACAGAA GCCAAATACT GAGAAAGACC TGGACOCAGG GCCAffPTACA 1200 

ACAGAAGACA CTOCTATGGA TGCTATTGAT GCAAACAAGC AACCAGAGAC AGCCGCOGAA 1260 

GAGCCGGCAA GTGTCACACC TTTGGAAAAC GCAATCCTTC TAATATATTC ATTCATGTTT 1320 

TATTTAACTA AGTCGCTAGT TGCTACATTG CCTGATGATG TTCAGCCTGG GCCTGATTTT 1380 

TATGGACTGC CATGGAAACC TGTA3TTATC ACTGCCTTCT TGGGAATTGC TTCGTTTGCC 1440 

ATTTTCTTAT GGAGAACTGT CC TTGT TO TO AAGGATAGAG TAIATCAAGT CACGGAACAG 1500 

CAAATTTCTG AGAAGTTGAA GACTATCATQ AAAGAAAATA CAGAACTTGT ACAAAAATTG 1560 

TCAAATTATG AACAGAAGAT CAAGGAATCA AAGAAACATG TTCAGGAAAC CAGGAAACAA 1620 

AATATGATTC TCTCTGATGA AGCAATTAAA TATAAGGATA AAATCAAGAC ACTTGAAAAA 1680 

AATCAGGAAA TTCTGGATGA CACAGCTAAA AATCTTCGTG TTATGCTAGA ATCTGAGAGA 1740 

GAACAGAATG TCAAGAATCA GGACTTGATA TCAGAAAACA AGAAATCTAT AGAGAAGTTA 1800 

AAGGATGTTA TTTCAATGAA TGCCTCAGAA TTTTCAGAGG TTCAGATTGC ACTTAATGAA 1860 

GCTAAGCTTA GTGAAGAGAA GGTGAAGTCT GAATGCCATC GGGTTCAAGA AGAAAATGCT 1920 

AGGCTTAAGA AGAAAAAAGA GCAGTTGCAQ CAGGAAATCG AAGACTGGAG TAAATTACAT 1980 

GCTGAGCTCA GTGAGCAAAT CAAATCATTT GAGAAGTCTC AGAAAGATTT GGAAOTAGCT 2040 

CTTACTCACA AGGATGATAA TATTAATGCT TTGACTAACT GCATTACACA GTTGAATCTG 2100 

TTAGAGTGTG AATCTGAATC TGAGGGTCAA AATAAAGGTG GAAATGATTC AGATGAATTA 2160 

GCAAATGGAG AAGTGGGAGG TGACCGGAAT GAGAAGATGA AAAATCAAAT TAAGCAGATG 2220

ATGGATGTCT CTCGGACACA GACTGCAATA TCGGTAGTTG AAGAGGATCT AAAGCTTTTA 2280 

CAGCTTAAGC TAAGAGCGTC CGTGTCCACT AAATGTAACC TGGAAGACCA GGTAAAGAAA 2340 

TTGGAAGATG ACCGCAACTC ACTACAAGCT GCCAAAGCTG GACTGGAAGA TGAATGCAAA 2400 

ACCTTGAGGC AGAAAGTGGA GATTCTGAAT GAGCTCTATC AGCAGAAGGA GATGGCTTTG 2460 

CAAAAGAAAC TGAGTCAAGA AGAGTATGAA CGGCAAGAAA GAGAGCACAG GCTGTCAGCT 2520 

GCAGATGAAA AGGCAGTTTC GGCTGCAGAG GAAGTAAAAA CTTACAAGOG GAGAATTGAA 2580 

GAAATGGAGG ATGAATTACA GAAGACAGAG CGGTCATTTA AAAACCAGAT CGCTACCCAT 2640 

GAGAAGAAAG CTCATGAAAA CTGGCTCAAA GCTCGTGCTQ CAGAAAGAGC TATAGCTGAA 2700 

GAGAAAAGGG AAGCTGCCAA TTTGAGACAC AAATTATTAG AATTAACACA AAAGATGGCA 2760 

ATGCTGCAAG AAGAACCTGT GATTGTAAAA CCAATGCCAG GAAAACCAAA TACACAAAAC 2820 

CCTCCACGGA GAGGTCCTCT GAGCCAGAAT GGCTCTTTTG GCCCATCCCC TGTGAGTGOT 2880 

GGAGAATGCT COCCTCCATT GACAGTGGAG CCACCCGTGA GACCTCTCTC TGCTACTCTC 2940 

AATCGAAGAG ATATGCCTAG AAGTGAATTT GGATCAGTGG AOGGGCCTCT ACCTCATCCT 3000 

CGATGGTCAG CTGAGGCATC TGGGAAACCC TCTOCMCTC ATCCAGGATC TCGTACAGCT 3060 

ACCATGATGA ACAGCAGCTC AAGAGGCTCT TCCCCTACCA GGGTACTCGA TGAAGGCAAG 3120 

GTTAATATGG CTCCAAAAGG GCCCCCTCCT TTCCCAGGAG TCCCTCTCAT GAGCACCCCC 3180 

ATGGGAGGCC CTGTACCACC ACCCATTCGA TATGGACCAC CACCTCAGCT CTGCGGACCT 3240 

TTTGGGCCTC GGCCACTTOC TOCAOOCTTT GGCOCTGGTA TGCGTCCACC ACTAGGCTTA 3300 
AGAGAATTTG CACCAGGCGT TCCACCAGGA AGACGGGACC TGCCTCTCCA OOCTCGGGGA 3360

TTTTTAOCTQ GACACGCACC ATTTAGACCT TTAGGTTCAC TTGGCCCAAG AGAGTACTTT 3420 

ATTCCTGGTA CCCGATPACC ACCCCCAACC CATGGTCCCC AGGAATACCC ACCACCACCT 3480 

GCTGTAAGAG ACTTACTGCC GTCAGGCTCT AGAGATGAGC CTCCACCT6C CTCTCAGAGC 3540 

ACTAGCCAGQ ACTGTTCACA GGCTTTAAAA CAQAGCCC AT AAA ACTATGA CCTCTGAGGT 3600 

TTCATTGQAA AGAAAGTGTA CTGTGCATTA TCCATTACAG TAAAGGATTT CATTGGCTTC 3660 

A AAATCC AAA AGTTTATTTT AAAAGGTTTG TTGTTAGAAC TAAGCTGCCT TGGCAGTGTG 3720 

CATTTTTGAG CCAAACAATT CAAAAATGTC ATTTCTTCCC TAAATAAAAA TCACCTTTTA 3780 

AGCTAGAGOG TCCTTACAAC TTTGAAATGT GCAATAAA6A ATACCTGTGT TTTAGCTAAT 3840 

GTAGCATATQ TAATTGCAAA ATGATTTAGA ATQTCATGAA AAATATGAAC ATTTCCTGTG 3900 

GAAATGCTTT AA6AACATGT ATTTCCATTA TCCTATTTTT AGTGTACACC AGCTGAATAC 3960 

GGAGCAATGG TGTTTATAAG OGTTTTTTTA AACTATCTGG TCACAAAGAC TGTTACGCTA 4020 

AAAATGTTTA CTAAAAEATC ACTAAACTAT glUOCCId'i 1 GCTGAAGTTC TTTGTAGTAA 4080 

TAQCTCATAA AAATTTGTTT ATTAATATTT CCCAAOTGTC TGTTGACTCA TTGGACTGTT 4140 

ATGAGGCTTG TGCCATTTGG GGAACATGTA AACTCAGGCT CCCAGAACTG AAGATGGTGQ 4200 

CTGGTGGCAC ACTTCCCGCT GCTCCTCCGT CACCTGTGAA CTCTACAAGT GATGTCTTTT 4260 

TATTTCAAAG AAGTTTATTT CCCACTTGTA TAGCATTCAC ATSCTTTCTT TACGATCCTC 4320 

ATTGTCTATT TGAGAATGGT TTTCTGAGAG TGAGTTTACA TTAOTAGCAA GAGTTGTTTG 4380 

ACCT6AT6TT CCATTGTTTT TACCATTCCT GTAGAAAAAG G6T6CACAAC AGAAAAATGA 4440 

AAATGATGTG TCATGGCCAT AAAAGTATAG AAATCTTTAA AAATTTTAAA ATGTACAGTC 4500 

CCTTATCTAT CTTTCCCATT CCTTGCCACT GATTTTTGAG GAATATAATA AAAAGATTGG 4560 

AAGAGTATAA TGCCATGAGA AAGAATGATT TA6GACTCTG AGGGTTATAA CATGCCCTAG 4620 

GTCAGCAACC AAGGGTTGAA ATCAGTTCTG TTTTAGGGGG AAATGGGGGG GGCGACAGAT 4680 

ATTATTCCAA AATTAATATT AATTAATATT TAAACGTTGG TGTTTTTATT TAAAAATCAG 4740 

TAACTAACCA TCTGGAATTG CACCATACTT AAAGTCTTAT CCATTACTAC ACTGTCTTTA 4800 

AAACAATGTT TCTTTAAATA CTCTACAACG TTTCTAAQAA CGAACTTCAG ACATTTTAAT 4860 

TACAGTAATA ATAGCACTCC TTTTAAGGAG TTTCAGATCC ACACTAAAAC TAAAATCATA 4920 

AAAGGCTGAT ACTTTTGTTT GCTGCTAGGC TATATTCTTC CATTCTTTGA AGTCCTATGA 4980 

TGTAATATTT TTGAAACCTA GTGTATGTCT TGTCACTGTT GTGATATTTA ATCGATTAAG 5040 

AATACCTTGT AAAAAGGAGC AAAAGCTTCA ATGTGAAACA ATTTTCTCTC TTTATACTAA 5100 

ACAACTGAAG ATAGATAGTT TAGAAAGATA AGGACCTTTG AAAGAAGACA ACTCTGTCAA 5160 

AGTTCATAAG GAATATAAAA ATTCTTCAGG AAAAGAGAAT TCAATCTATA TGTCCTCCCG 5220 

TTTAATATCA AGAATAGAAG AAATTAAGAG GAAAACTCCA CAGAAGAGCA TAGGCCACTT 5280 

TTAGCCATGT AAAAATAAGA TTAAGTCACA AATACAACTT TTGAATTTAC CTGTCAATAT 5340 

CTCTTTAGGA CACAAAACAA TGCTGAAGTT AATATAATTT CTAATTTTAA ATGTCATTTA 5400 

AGTGTAGATT ATGCCATCTA GGAAGGTAAG TAGGAAAGGT AAATTAAATC TATTTTTAAA 5460 

ATTCAAAATA TTAGAGTATT TTTCCCCTCT AAAGCCTTTT TTGGTGATTA TTCTGTATCT 5520 

GACATAATTG AGAAACTGGT AAGCTGTAAA GATTCCAGTG TAGCTTCTCT GAGAAGTTGT 5580 

GAGCCAGTCC ATAACTGCTT CCTCACATCC ATCTGATTGC ACCATTTCTG CAGCAAACCC 5640 

CAAAGCAGGG TGCCAATATG CAGATGGCAT AGGGAGTATC ATCCCTCAGC CAAATCACTT 5700

TTCCATCTCT AAAGTTTCAT CTATTTTGGA AGTCATCTCC AACTAATTGT GTCTGGATTT 5760 

AGTTGCTAAA ATTGTCTTAT TTATTTATGA AGCAGCAATA TTCAGCCTGA AAGCATTTCT 5820 

GCCATAGTTG TTGTAGTTAT ATCGCCAATG GCTGATTTTT TTCATTGGAA AGTAAATTTA 5880 

AGTAATTCGT GGGATGTGGT ATATTCTGTG TCAACTTCAA GATAATCACT CATTTTCTCG 5940 
TTAIATTCAG GTCTGAATTA AAGTTAAGTT AATCAC 



Protein Accession #: BAA13448



1 11 21 31 41 51 

I I I I I I 

AFLSKVEEDD YPSEELLEDE KAINAKRSKE KNPGNQGRQP DVNLQVFDRA VLGTIHFDFE 60 

IEESKQETSM ILDSEKTSET AAKGVNTGGR EPNTMVEKER PLADKKAQRP FERSDFSDSI 120 

KIQTPELGEV FQNKDSDYLK NDNPEEHLXT SGLAGEPEGE LSKEDHGMTB KYMSTESQGS 180 

AAAEPEDDSF HWTPHTSVEP GHSDKREDLL IISSFFKEQQ SLQRFQKYFN VHELEALLQE 240 

MSSKLKSAQQ ESLPYNMEKV LDKVFRASES QILSIAEKML DTRVAENRDL GMNENNIFEE 300 

AAVLDDIQDL IYFVRYKHST AEETATLVMA PPLEEGLGGA MEEMQPLHED NFSREXTAEL 360 

NVQVPEEPra LDQRVIGDTH ASEVSQKPNT EKDLDPGPVT TEDTPMDAID ANKQPETAAE 420 

EPASVTPLEN AILLIYSFMF YLTKSLVATL PDDVQPGPDP YGLPWKPVPI TAFLGIASFA 480 

IFLWRTVLW KDKVYQVTEQ QISEKLKTIM RENTE LVQKL SNYEQKTKES KKHVQETRKQ 540 

NMILSDEAIK YKDKIKTLEK NQEILDDTAK NLRVMLESER EQNVKNQDLI SEKKKSIEKL 600 

KDVISMNASE FSEVQIALNE AKLSEEKVKS ECHRVQEENA RLKKKKEQLQ QEIEDWSKLH 660 

AELSEQIKSF EKSQKDLBVA LTHKDDNINA LTNCITQLNL LECESE5EGQ NKGGNDSDEL 720 

ANGEVGGDRN EKMKNQIKQH MDVSRTQTAI SWEEDLKLL QUCLRASVST KCNLEDQVKK 780 

LEDDRNSLQA AKAGLEDECK TLRQKVEILN BLYQQKEMAL QKKLSQEEYE RQEREBRLSA 840 

ADEKAVSAAE EVKTYKRRIE EMEDELQKTE RSFKNQIATH EKKAHENWLK ARAAERAIAE 900 

EKREAANLRH KLLELTQKMA MLQEEPVTVK PMPGKPNTQN PPRRGPLSQN GSFGPSPVSG 960 

GECSPPLTVE PPVRPLSATL NRRDMPRSEF GSVDGPLPHP RWSAEASGKP SPSDPGSGTA 1020 

TMMNSSSRGS SPTRVLDEGK VNMAPKGPPP FPGVPLMSTP MGGPVPPPIR YGPPPQLCGP 1080 

FGPRPLPPPP GPGMRPPLGL REFAPGVPPG RRDLPLHPRG FLPGHAPFRP LGSLGPREYF 1140 
IPGTRLPPPT HGPQEYPPPP AVRDLLPSGS RDEPPPASQS TSQDCSQALK QSP 

SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

A6ACTGAGGC GGAGGCAGOC CCGCGCCGCG CCGGACCCGA GCATATTTCA TTTTCTGTCA 60 



TTGGACTTTG 
CTTGGGGTTT 
TAAAAGATGG 
TTGATGGAAT 
GTACAGGCTC 
TTCCTGTTCA 
CTGTGTCCAA 
CTOC TO TOTC 
CCCATGCGAC 
TGTTCGCTGC 
CACT6A6CGC 
OTTCCGAGAC 
AACAGCAAAA 
TACCCACTCA 
CAAGAACTGG 
AACATTTGAA 
CTCCGCAGTT 
CAACCTCTGG 
GATCCACTGG 
CTGGAAGAAT 
TGGGACAAAC 
CAGGGAAACG 
TGGCACTGGG 
TGGCCTACAT 
AATTCTTTGC 
CGTTGAAACA 
GGAACAATGT 
TCTTTGGTAC 
AAGCTCTGGQ 



CTGTGAATTT 
AAATTAAAAT 
AQTGGCCCTG 
CATAAAGTAA 
AAATAAGCTT 
AGTGAAGAAT 
TGTTAGGTAG 
AACAGAA3TA 
TTAAACAGAG 
GGGCGGTGGC 
GAGGTCAGGA 
ACAAAAATTA 
GCACGAGAAT 
CACTCCAGCC 
TATTTTTGCC 
TGTGTCATGC 
GCTACCATAT 
TTCATTGTGT 
GTAAAGATTT 



AGCCATTAGA 
CCGGCTGCAG 
CGGCAAGGCA 
AAATGCACAA 
TTTGAATATG 
AAAGQGAGAA 
AGTCACTTOC 
TTCACCAAAA 
CACCTCATCA 
ATCTGGACT8 
TGGTAAAACT 
TTCTCAGGAG 
TGGCCCACCA 
CAGTGATGCC 
AACAACTCA6 
AGAATCTGAA 
GGCTTCCTTQ 
CAGACCAGGG 
CGTCATCAAG 
CTCAAACAGC 
CCAGCCAAGT 
AACTCCGATG 
GAAATCTTGG 
TGGATTTGTA 
CCCTGAATGT 
AACTTGGCAT 
TTTTCACTTG 
TATATGCCAT 
CTACACCTGG 
GACCTTTTTC 
TTGAAAGTCA 
TACTAATTAA 
AAGGAATAAA 
AGAGACGGTT 
TATAAAAACC 
TTAATTTTAG 
TTATGAGTAA 
TTGTATTTAA 
AATTTTATCA 
TCACGCCTGT 
GTTTGAGATC 
GCCGGACGCA 
CACTTGAACC 
TGGGTGACAG 



GGGCGGATCA 
CTACTAAAAA 
GGGAGGCTGA 
CACACCACTG 



CAGTAAGAGA 
AGCTTATAAG 
ATGCTTCATC 
AATTAAATAA 
CAGGTGTGGT 
TGAGGTCAAG 
TACAAAAATG 
GGCAGGAAAA 
CACTCCAGCC 



ACCATGAGCA 
GGCGGTAAGG 
GCCCAGGCAA 
GGAATGACTC 
ACTCTGCAAA 
CCTAAAGAAG 
ACAAACAACA 
GTCACATCCA 
CATGCTTCCC 
CATGCTAATG 
GCAGTTAATG 
CTAGCAGAGG 
AGAAAACACA 
AGCAAGAAGA 
TCTCGCTCTT 
GCCGATAATA 
GTAGCTTCCA 
GTTACCAGCC 
TCACCAAGCT 
GCTACTTACT 
GACCAGGACA 
TGCGCCCATT 
CACCCAGAAG 
GA6GAGAAAG 
GGTCGATGCC 
GTTTCCTGTT 
GAGGATGGTG 
GGATGTGAAT 
CATGACACTT 
TCCAAGAAGG 
ACAGTTCAGG 
TTTTTAGATT 
TTCCAGCTTT 
TGGCATTTAT 
AATTTCCTGA 
AATAAATAAT 
ATCTGCAAAA 
AAAAAAACTA 
GTAATAGGTG 
AATCCCAGCA 
AGCCTGGCCA 
GTGGCACGCG 
CGGGAGGGAG 
AGTGAGACTC 
TCATTCTAGT 
TGTTATATTC 
TCTCAAATTT 
ACCTATATTA 
TTTTGGCCTC 



ACTACAGTGT 
ATTTCAACAT 
ATGTAAGAAT 
ATCTTGAAGC 
GAGCATCTGC 
TAGTTAAACC 
TGGCCTACAA 
TCCCATCACC 
CTTCACCCGT 
CCAATCTTAG 
TCCCACGGCA 
GACAGAGAAG 
TTGTGGAGCG 
GACTGATTGA 
TCCGAATCCT 
CAAAGAAGGC 
CACGGAGCAT 
TCACAACTGC 
GGCAACGGCC 
CAGGATCAGT 
CTTTAGTGCA 
GTAACCAGGT 
AATTCAACTG 
GAGCCCTGTA 
AAAGGAAGAT 
TTGTGTGTGT 
AACCCTACTG 
TTCCCATAGA 
GCTTTGTATG 
ACAAGCCCCT 
AGAAGAGAAG 
CAATATTTAT 
AAAAACCAAG 
TATTACTTTT 



AGATCAAGAT 
AGCTGGGCAT 
TTCTTGAACC 
TGGTGACAGA 



CCAATCTGAA 
GGCAAT6AAA 
ATACTTATCT 
TCAGTTTTTA 
CTTTGGGAGG 
ACATGGTGAA 
CCTGTAATCC 
AGGTTGCAGT 
CGTCTCCAAA 
AGGAAAGGAC 
TTTTCTTATT 
TTGCCTTTTA 
GGCAA ATTCC 
TCATAGTTTT 
TGTGATCCCA 
CATCCTGGCC 
GGTGGGGCGT 
CAGGAGACGG 
GCAAGACTCC 



GTCACTGGTT 
GCCTCTGACA 
AGGCGATGTG 
CCAGAATAAG 
TGCACCCAAG 
TGTGCCCATT 
TAAGGCACCA 
ATCGTCTGCC 
GGCTGCCGTC 
TGCTGACCAG 
GCCCACAGTC 
AGGATCCCAG 
CTATACAGAG 
GGATACTGAA 
TGCCCAGATC 
AAATAACTCT 
GCCCGAGAGC 
AGCTGCCTTC 
AAACCAAGGA 
GGCACCAGCC 
AAGAGCTGAG 
CATCAGAGGA 
CGCTCACTGC 
TTGTGAGCTG 
CCTTGGAGAA 
AGCCTGTGGA 
TGAGACTGAT 
AGCTGGTGAC 
CTCAGTGTGT 
GTGTAAGAAA 
GAATTTGAAG 
ATGGAGTTTT 
TCTGAGGAAA 
TCCTGTATTT 
AATTCATCTT 
ATAATTATAC 
ATGCCTTAAA 
TTAAAATAGT 
AAAAATTGCT 
CCAAGGTGGG 
ACCCCAXCTC 
CAGCTACTCA 
GAGCCAAGAT 
AAAAAACT1T 



GGCCCAGCTC 
ATCTCTAGTC 
GTTCTCAGCA 
ATTAAGGGTT 
CCTGAGCCGG 
ACATCTCCTG 
CGGCCTTTTO 
TTCACCCCAG 
ACTCCTCCCC 
TCTCCATCTG 
ACCAGCGTGT 
GGTGACAGTA 
TTTTATCATG 
GACTGGCGTC 
ACTGGGACTG 
CAGGAGCCTT 
CTGGACAGCC 
AAGCCTGTAG 
GTACCTTCCA 
AACTCAGCTT 
CACATTCCAG 
CCATTCTTAG 
AAAAATACAA 
TGCTATGAGA 
GTCATCAATG 
AAGCCCATTC 
TATTATGCCC 
ATGTTCCTGG 
TGTGAAAGTT 
CATGCTCATT 
AGAAAAAGGA 
GAAAAATAAT 
TATTTGGCTT 
TATGCCCATA 
AGAATAAATT 
CTTCTTTCCT 
TTTTATCAAT 
AAATAGGATT 
TGTAGGCTGA 
TGGACCACAT 
TACTAAAAAT 
AGAGGCTGAG 
CGTACCACTG 
GCTTGTATAT 



TCTTCCCCAC 
CTAAAATGTG 
ATTTTTTCCC 
CTCTCTCTTT 



AACATGGTGA 
GCCTGTAGTC 
AAGTTGCAGT 
GGCTCTT 



CCAAAAATAA 
ATTGTTTCTG 
TTGCGCTAAG 
AAAGAGAATA 
AGGCCAAGAC 
AACCCTGTCT 
CCATGTACTT 



SEQ ID NO:52 PAB9 Protein sequence
Protein Accession #: NP_006448



120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880
2940 
3000 
3060 
3120 
3180 
3240 



1 

I 

1 HSNYSVSLVG 
61 MTHLEAQNKI 
121 NNMAYNKAPR 
181 ANANLSADQS 
241 KHTVEHYTEF 
301 DNTKKANNSQ 
361 PSWQRPNQGV 
421 ABCNQVXRGP 
481 RCQRKILGEV 
541 



11 
I 

PAPWGFRLQG 
KGCTGSLNMT 
PFGSVSSPKV 
PSALSAGKTA 
VHVPTHSnAS 
EPSPQLASLV 
PSTGRISNSA 
FLVALGKSWH 
JUALKQTWHV 
FLEAIXSYTOH 



21 
I 

GKDFNMPLTI 
LQRASAAPRP 
TSIPSPSSAF 
VNVFRQPTVT 



ASTRSHPESL 
TYSGSVAPAN 
PEEFNCAHCK 
SCFVCVACGK 
DTCFVCSVCC 



31 
I 

SSLKDGGKAA 
EPVPVQKGEP 
TPAHATTSSH 
SVCSETSQEL 
WRPRTGTTQS 



SALGQTQPSD 
OTMAYIGPVE 
PIRNNVFRLE 
ESLEGQTFFS 



41 

I 

OANVRIGDW 
KEWKPVPIT 
ASPSPVAAVT 
AEGQRRGSQG 
RSFRILAQIT 
TSLTTAAAFK 
QDTLVQRAEH 
EKQALYCELC 
DGEPYCBTDY 



51 
I 

LSIDGINAQG 
SPAVSKVTST 
PPLFAASGLH 
DSKQQNGPPR 



PVGSTGVIKS 
IPAGKRTPMC 
YEKFFAPECG 
YALPGTICHG 
AHSVNP 



60 
120 
180 
240 
300 
360 
420 
480 
540 



SEQ ID NO:53 PBH7 DNA SEQUENCE

Nucleic Acid Accession #: AA431407

Coding sequence: 1-864 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGQCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGAGCA CTGCTATACT 
GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GU jQL w lTC TG 
CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGCCACCTA CTGGOGAATG 
AAGATCAAGC CGGGTTTCAIT GGGGAAGGCC ACTCCACCCT ATGACGTCCA GTTTCATATG 240

GAGGCCTCAG TTGAAAACTG CATTATTGTG AGCATGAACA CCOCTGACCC T66CAGCCAG 300 

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AGGGCAGCAT CCTGCCACCT 360 

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TGCTATGAGG GTGACCCAGA GAAGACAGCT AAAGTGGAAI GTGGGGACTT CTACAACACT 480 

GGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATTTGTT TCCTGGGGAG GAGTGATGAC 540 

ATCATTAATG CCTCTGGGTA TCGCATCGGG CCTGCAGAGG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGXC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTO 660 

GTGAAGGCCT TTATTGTCCT GACCCCACAG TTCCTGTCCC ATGACAAGGA TCAGCTGACC 720 

A AGQAA CTGC AGCAQCATGT CAAGTCAGTG ACAGCCCCAT ACAAGTACCC AAGGAAGGTG 780 

GAGTTTGTCT CAGAGCTGCC AAAAACCATC ACTGGCAAGA TTGAACGGAA GGAACTTCGG 840 

AAAAAGGAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAfl AAOGCACTGC ACACCTGAGG 900 

CAAATCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTGGG GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTT C TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080 
TCATAACGCA AGTAATAAAA TACTGATATC AACAA 

SEQ ID NO:54 PBH7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51 

I I I I I I 

MANCKMTKSI RFPALEHCYT GGEWLFKDQ EEWKRRTGLL LYENYGQSET GLICATYWGM 60 

KUCFGFMGKA TPPYDVQFHM EASVENCXIV SMNTADPGSQ GITHSLLLQV IDDKGSILPP 120 

NTEGNIGIRI KPVRPVSLPM CYEGDPEKTA KVECGDFYNT GDRGKMDEEG YICFLGRSDD 180 

UNASGYRIG PAEVESALVH HPAVAESAW GSPDPIRGEV VKAPIVLTPQ FLSHDKDQLT 240 
KELQQHVKSV TAPYKYPRKV EFVSELPKTI TGKIKRKELR KKETGQM 

SEQ ID NO:55 PBJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF388200

Coding sequence: 33-137 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60 

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120 

TGGAAAAGGG TCACTGAAAT GGGACGAC AT GAA CTCAAGG AGGCTATTTA TGACCATGTC 180 

ATTTGCAACA TGAAGAAAGC TTATCTGGAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240 

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300 

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360 

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420 

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATQTCAT GTCTTTTAAA 480 
TAAGCCATAC TTTATGTTCA ATAAAAAGAG AATAAGCAGG A 

SEQ ID NO:56 PBJ5 Protein sequence
Protein Accession #: AAK83352 

1 11 21 31 41 51 

I I I I I I 

MCCEIYYRLL VLKMEKKSEE LRNMDGLGNV EKGH
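
The "Coding sequence" coordinates given with each DNA listing are 1-based and inclusive, so a protein listing can be cross-checked by extracting that span and translating it with the standard genetic code. The following is a minimal sketch of that check for PBJ5 (SEQ ID NO:55, coding sequence 33-137). The function names (`join_listing_rows`, `extract_cds`, `translate`) are hypothetical helpers introduced only for this illustration, and the three rows are transcribed from the listing above with block spacing normalized.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def join_listing_rows(rows):
    """Drop spaces and position numbers from printed rows, keeping only bases."""
    return "".join(ch for row in rows for ch in row.upper() if ch in "ACGT")


def extract_cds(sequence, start, end):
    """Return the coding sequence for 1-based, inclusive coordinates."""
    return sequence[start - 1:end]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Rows covering positions 1-180 of the PBJ5 DNA listing (SEQ ID NO:55).
pbj5_rows = [
    "GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60",
    "TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120",
    "TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180",
]

pbj5_dna = join_listing_rows(pbj5_rows)
pbj5_cds = extract_cds(pbj5_dna, 33, 137)
pbj5_protein = translate(pbj5_cds)

print(pbj5_cds[:3], pbj5_cds[-3:])  # first and last codon of the span
print(pbj5_protein)
```

The same check applies to any other listing whose rows are intact, provided the coordinates are read as 1-based and inclusive. A span whose length is not a multiple of three, or whose last codon is not a stop codon, suggests a transcription error in either the coordinates or the sequence.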

SEQ ID NO:57 PBJ7 DNA SEQUENCE

Nucleic Acid Accession #: AA876910

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60 

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AAACTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGCC CCTAGAAAAG 180 

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240 

GTTTCTTTTC CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300 

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AATTTGCCGG TCATAGGAGC AGGAAGTGTC 480 

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600 

AGCTGTAGAG ATACTTACCA gTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660 

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTCC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780 

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840 

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGG 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

GTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020 

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080 

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTEATT ATGTAGGATT AGGAGTAGAA 1140 

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTSCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACOGGGT ATAACTTATC TO CTTCTC CT 1260 

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA AXACCTGGTT GGCCTGCACC TCAGOTCTCA CTCGCTGCAT TAATGGAACT 1380 
GAACCAGGAC CTCTCCTCTS CGTGTTAGTT CATGTACTTC CCCAGGTATA 1GTGTACAGT 1440

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500 

GTCCCACTTC TGGTTCCOCT ATTGGCT6GT CTTAGCATAG CTGGATCAGC AGCCATTGGT 1560 

AC6GCTCCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGOATGCT 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACMT CCCAGGTAGA GTCTCTGGCT 1680 

GAAGTAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740 

TGTGCAGCTC TAGGAGAAAG TOmXCTJ C TATGCCAATC AATCTGGAGT CATAAAAGGT 1800 

ACAOTAAAAA AAGTTCGAGA AAATCTAGAT AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGOCTAA CTACTTTAAT CACTGGQTTA 1920 

GCTGGACCTC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATTAAATTCG 1980 

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040 
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51 

I I I I I I 

MDSCLQHMRD LLYLLQBLRC LNPATLLPDP DSTTPVHDCQ DLLETTKTGQ PDLQDVPLEK 60 

ADATVFTDGS SPLEQGEHKA VSFFQFDLFD NPTYSTEEEK LASDVGANKN QEGRVFANTT 120 

HRAGTSKEVS FAVDLCVLFP EPARTHEEQH NLPVTGAGSV DLAAGFGHSG SQTGCGSSKQ 180 

AEKGLGNVD? YLCPGNHPDA SCRDTYQFFC FDWTCVTLAT YSGGSTRSST LSISHVPHPK 240 

LCTRKNCKPL TITVHDPNAA QWYYGMSWGL RLYIFGFDVG THFTIQKKH. VSWSSPKPIG 300 

PLTDLGDPIP QKHPDKVDLT VPLPFLVPRP QLQQQHLQPS IHSILGGVHH LLNLTQPKLA 360 

QDCWLCLKAK PPYYVGLGVE ATLKRQPLSC HTRPRALTIG DVSGNASCLI STGYNLSASP 420 

FQATCNQSLL TSISTSVSYQ APNNTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVWYS 480 

GPBGRQLIAP PELHPRLHQA VPLLVFLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540 

DFSNLQSAID ILHSQVESLA BWLQNCRCL OLLFLSQGGL CAALGESCCF YANQSGVIKG 600 

TVKBCVRENLD RHQQERENNI PWYQSMFNWK PWLTTLITGL AGPLLILLLS LIFGPCILNS 660 
FLNFTXQRIA SVKLTYLKTQ YDTLVNKf 

SEQ ID NO:59 PCQ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_019005

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

T6GTGAAT6A ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180 

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300 

TGGATCTTTA CQTTTATCTG AAGACTCTGC AGCTACATTA CTGTCAATAA ATTCAGATAC 360 

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420 

TG6ACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCAIA ACTCAAAGTT 480 

CAAAGATTTG ATAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GATAAGCACA GAGCTGACTT 600 

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATATAGTTC CCATGGAAAA 660 

AGTGAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720 

GTTAGGACAO AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCAOGAGACC AGAAACTTCT 780 

OCTTGCTGGT ATGCATCGTA ACCTAGCTAT ATTTGATCTT CGGAATACAA GCCAAAAGAT 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACCCTT AACAAAAGTA GCATQGTGTC CCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT CCCATTGGGG ATGAAACTGA ACOCACAATA ATTGAAAGAA GTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT CCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TOTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGCCTG 1260 

GAGCCCAATT ACATCTTTAA TGTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAGA 1440 

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGATC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCCCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800 

CCTGAATGAA GGGGCATCTT CTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860 

TGGCTTTATC GGGTTATACG GATQAQAAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920 

TGCGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTGC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGQ AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA GAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG GTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC CCAGTTCCAA GGCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT CCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGTGAGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT G T CCTG GC T Q TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580 
GGAACCAAAT CAGATGAAAA AGTGGACTTG AGCAAGGACA
AACTGGTTTA CATGGTGTCA TAATTGCAGG CACGGTGGAC 
TGGTTCAGGO ACCATGCAGA GTGCCCTGTQ TCTGCATCCA 
GATACAACGG GGAATCTGGT ACCTGCAGAG ACTGTCCAGC 
AGAGAACOCT TCAAGTGTGG AGCTTTCTAG TAGGTGTCCT 
TCAGAACAAG CCATTCATGA CTTACCTGTA ATGGGAAAAT 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCQ1 Protein sequence
Protein Accession #: NP_061878

1 11 21 31 41 51

I I I I I I

MSGTKPDILW APHHVDRPW CDSBLSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60 
PYMKCVAWYL NYDPECLLAV GQAKGRWXT SLGQDEHN5KF KDLIGKEFVP KHARQCHTLA 120 
WNPLDSNWLA AGLDKHRADF SVLIWDICSK YTPDIVPMEK VKLSAGETET TLLVTKPLYE 180 
LGQNDACLSL CWLFRDQKLL LAGMHRNLAI PDLRNTSQKM FVNTKAVQGV TVDPYFHDKV 240
ASFYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GUATLTRDS NI IRLYDKQH 300 
TPTPIGDBTB PTIIERSVQP CDNYTASFAW KPTSQNRMTV VTPKHTMSDF TVFERISLAW 360 
SPXTSLHHAC GRHLYECTEE EKDNSLEKDI ATKHRLRALS RYGLDTEQVW RNHILAGNED 420 
PQLKSLWYTL HFMKQYTEDM 0QKSP6NKGS LVYAGIKSIV KSSLGMVESS RHNWSGLDKQ 480
SDIQNLKEER ILALQLCGWI KKGTDVDVGP PLNSLVQBGB WBRAAAVALP NLDIRRAIQI 540 
LNEGASSEKG RRSESQCGSN GFIGLYO 

SEQ ID NO:61 PDQ3 DNA SEQUENCE

Nucleic Acid Accession #: U42359

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons) 

1 11 
I I 

TTGTACATCT TAACAACCTT 
GATCAGCCCA CAGTACACAT 
GAGTCCTGGC TTTGTAAAAT 
AAGCTGGCAT TCTGTAAAGG 
AGTGCTAAAT CTTGTAATAA 
CCATATTGTT GTATTTCATT 
TGTGATTTGG ACCATGGCAC 
CTGAGCCTCA GTTTTCCTCA 
AAGTTGTAGT AAATTACTQT
TTCCAGTCTT ACATTATTAT 
TACCAAAAGA CTGACACGTG 
CAAATATACT TTCTTTAACT 
AAATATTTTT TAATTCTATC 
AATAATOTAA GCCTTAATAT 
CTTACTTGAA AACTTT 

SEQ ID NO:62 PDQ3 Protein sequence
Protein Accession #: AAB18375

1 11 21 31 41 51

I I I I I I

KGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCTQLGG GQKKKENLLA EKVEQLKEWS 60 
SRRSIPRHNG DKFRKFIKAP FRHYSMXVMF TALQPQRQCS VCRQANEEVQ ILANSWRYSS 120 
AFCNKLFFSM VDYDBGTDVF QQLNKNSAPT PXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 
WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLBFIYNKtfG MAMVSLCIVP 240 
AMTSGQWWNH IRGPPYAHKN PHKGQVSYTH GSSQAQFVAE SHIILVLHAA ITMGMVLLNE 300 
AATSKGDVGK RRIICLVGLG LWFFFSFLIt SIFRSKYHGY PYSDLDFE 

SEQ ID NO:63 PDG8 DNA SEQUENCE

Nucleic Acid Accession #: AL080235

Coding sequence: 245-453 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGTCGCCGCA OCGGCCGCCT COGGCCCGCC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60 

GGGGCGGCCA CCGCGCTGOC AGOCTACCCC GCGGCCGAGC CGCCCGGGCC GCTGTGG C TG 120 

CAGGGOGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGGTGGCCTG CTTCATGACC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTCCCCAT CATOGCCGGC 300 

TTCCTGCCCA ACGGCATGGA ACAGCGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360 

GCCGCAGTGC CCGCAGGGAC CACCGCAGCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420 

GCGGCOGTCA CTTCGGGGGT GGCGACCAAG TGACCCGCTC CGCTCCTCCC TGTGTCCGTC 480 

CTGTGTCCGC GCGCGOGGGT GCCTTTOCCG CCGGGOACTC GGCCGGTGTG CTTCGTGCTG 540 

TAGTTATCGT TAGTTCCTCT TCOCGAGATG GGGCCGCCGA GAGACCOCAG CGCCTTTGAA 600 

AAGCAAGGTT TGTGCTQOGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660 

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTOCT TCCOOGCAOC CACGT GCCCT AGATTCATGG CAGAAAATGA 780 

CCAAATCCTG TGTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG A TTAAATT GC TATTGC TGTA GTAAG AGAAG CTCTTTGTAT 900 

CTGAACATAO TTGTATTTGA AATTTCTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960 



AAAAATTAGC CCAATTTAAC 2640 

ATGCTGGACA TATGCTTAGT 2700 

CGTGTAAATQ TATGCAGTTG 2760 

CATAAAATGT TAOCACCTTA 2820 

TCATAGCTCA GAAACATACC 2880 

AAATCATTCT ATCAGAAAAA 2940 



21 31 41 51 

I I I I 

AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60 

CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120 

GACTTATAAA GGTGCAAGGA TTTAGAGATG ATTAAGAGAT 180 

CACCATCGTC TATCCCCTGT CTTATCTAGA TAAAGAATGT 240 

TATTGTACAA ATGGAAATTC AATCTTAAGG ATTATTTTTT 300 

GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCATAAGAC 480 

TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540 

GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600

GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCCCAA 660 

TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780 

TCAAGGGGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840 
CATGGGAAGG ATTTAACACC GATATATTGT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020
TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGCGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDQ8 Protein sequence
Protein Accession #: CAB45781

1 11 21 31 41 51

I I I I I I

GRRTGRLRPA AAPSAAAATA GAPTALPAYP AAEPPGPLWL QGEPLHPCCL DFSLEELQGB 60
PGWRLNRKPI ESTLVACFMT LVZWWSVAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 120
AAVPAGTTAA AAAAAAAAAA AAVTSGVATK

SEQ ID NO:65 PDM1 DNA SEQUENCE

Nucleic Acid Accession #: NM_006765

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGGCCGCGGC OCGGGTCCCT OGCAAAQCCO CT6CCATCCC GGAGGGCCCA GCCAGCGGGC 60 

TCCCGOAGGC TCGCC6G6CA 60CGTGGT6C GCGGTAGGAG CT6G6C6CGC ACGGCTACCG 120 

CGCGTGGAGG AGACACTGCC CTQCCGC GAT CG SOGCCCGG GGCGCTCCTT CACCCCGTAG 180 

GCAAGC6G66 0GG06GCTGC GGTACCTGCC CACCGGGAGC TTTCCCTTCC TTCTCCTGCT 240

GCTGCTGCTC TGCATCCAGC TCGGGGGAGG ACAGAAGAAA AAGGAOAATC TTTTAGCTGA 300

AAAAGTAGAG CAGCTGATGG AATGGAGTTC CAGACGCTCA ATCTTCCQAA TGAATGGTGA 360 

TAAATTCCGA AAATTTATAA AGGCACCACC TCQAAACTAT TCCATGATTG TTATGTTCAC 420 

TGCTCTTCAO CCTCAGCGGC AGTGTTCTGT GTGCAGGCAA GCTAATGAAG AATATCAAAT 480 

ACTGGG6AAC TCCTGGCGCT ATTCATCTGC TTTTTGTAAC AA6CTCTTCT TCAGTATGGT 540 

GGACTATGAT QAGGGQACAG ACGTTTTTCA GCA6CTCAAC ATSAACTCTO CTCCTACATT 600 

CAYCCATTTW CCTCCAAAAG GCAGACCTAA GAGAGCTGAT ACTTTTGACC TCCAAAGAAT 660 

TGGATTTGCA GCTGAGCAAC TAGCAAAGTG 6ATTGCT6AC AGAACGGATG TTCATATTCG 720 

GGTTTTCAGA CCACCCAACT ACTCTGGTAC CATTGCTTTG GCCCTGTTAG TGTCGCTTGT 780 

TGGAGGTTTG CTTTATTNGA GAAGGAACAA CTTGGAGTTC ATCTATAACA AGACTGGTTG 840 

GGCCATGGTG TCTCTGTGTA TAGTCTTTGC TATGACTTCT GGCCAGATGT GGAACCATAT 900 

CCGTGGACCT CCATATGCTC ATAAGAACCC ACACAATGGA CAAGTGAGCT ACATTCATGG 960 

GAGCAGCCAG GCTCAGTTTG TGGCAGAATC ACACATTATT CTGGTACTSA ATGCCGCTAT 1020 

CACCATGGGG ATGGTTCTTC TAAATGAAGC AGCAACTTCG AAAGGCGATG TTGGAAAAAG 1080 

ACGGATAATT TGCCTAGTGG GATTGGGCCT GGTGGTCTTC TTCTTCAGTT TTCTACTTTC 1140 

AATATTTCGT TCCAAGTACC ACGGCTATCC TTATAGTGAT CTGGACTTTG AGTGAGAAGA 1200 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GCTTTTTAAT TAAATGAAGC 1260 

CAAGTGGGAT TTGCATAAAG TGAATGTTTA CCATGAAGAT AAACTGTTCC TGACTTTATA 1320 

CTATTTTGAA TTCATTCATT TCATTGTGAT CAGCTAGCTT ATTCTTGTGT ACTTTTTTTA 1380 

AACTGTGGGT TTTCCTAGTA AATTTAATTT ACAGAAATCA ATGGTAGCAT TTAGTAATCT 1440 

ACAAAGGAAA TATCAAAGTG TTTTTCAAGC CTGTTATATY CAGTGTGTKC CACAGGATTG 1500 
CAATAAATGA CAATGTAATT A 



SEQ ID NO:66 PDM1 Protein sequence 
Protein Accession #: NP_006756 

1 11 21 31 41 51 

I I I I I I 

MGARGAPSRR RQAGRRLRYL PTGSFPPLLL LLLL&QLGG GQKKKENLLA EKVEQLMEWS 60 
SRRSIFRHNQ DKFRKPIKAP PRNYSMTVMF TALQPQRCCS VCRQANEEYQ XLAHSWRYSS 120 
AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT PXHXPPKGRP KRADTFDLQR IGFAAEQIAK 180 
WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG KAMVSLCIVF 240 
AMTSGQHWHH JRGPPYAHKN PHNGQVSYIH GSSQAQFVAB SHULVUOVA ITMGMVLLNE 300 
AATSKG0VGK RRIICLVGLG LWFFFSFLL SIFRSKYHGY PYSDLDFE 






SEQ ID NO:67 PDM2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000947 

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCAOCGTT TGTGTTTTCC 60 

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120 

AGGTTGGCAO GTGACCAGAO GAATGCTTCC TACCCTCATT GOCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300 

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360 

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420 

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTOCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGCCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT GGAGAACAGG AQATT OTTGC CTCATCACCA 600 

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TCCATTTATA AGATCCCTTT TGCTGATGCT 660 

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720 
AAQGACATTO TQGCAATCAT CCTGAATGAA TTTAGAGCCA AACTGTCCAA GGCTTTGGCA 780 
TTAACAGCCA GGTCCTTGCC TGCTGTGCAG TCTGATGAAA QACTTCAGCC TCTGCTCAAT 840 
CACCTCAGTC ATTCCTACAC TGGCCAAGAT TACAGTACCC AGGGAAATGT TGGGAAGATT 900 
TCTTTAGATC AGATTGATTT GCTTTCTACC AAATCCTTCC CACCTTGCAT GCGTCAGTTA 960 
CATAAAGCCT TGCGGGAAAA TCACCATCTT CGTCATQQAO GCCGAAT6GA QTATGGCCTA 1020 
TTTCTGAAGG GCATTGGTTT AACTTTGGAA CAGGCATTGC AGTTCTGGAA GCAAGAATTT 1080 
ATCAAAGGAA AGATGGATCC AGACAAGTTT GATAAAGGTT ACTCTTACAA CATCCGTCAC 1140 
AGCTTTGGAA AGGAAGGCAA GAGGACAGAC TATACACCTT TCAGTTGCCT GAAGATTATT 1200 
CTGTCCAATC CACCAAGCCA AGGGGATTAT CATGGGTGCC CATTCCGTCA CAGTGATCCA 1260 
GAGCTGCTGA AGCAAAAGTT GCAGTCATAC AAGATCTCTC CTGGAGGGAT AAGCCAGATT 1320 
TTGGATTTAG TAAAGGGGAC ACATTACCAG GTAGCCTGTC AAAAATACTT TGAGATGATA 1380 
CACAATGTGG ATGATTGTGG CTTTTCTTTG AATCATCCTA ATCAGTTCTT TTGTGAGAGC 1440 
CAACGTATTC TAAATCGTGG TAAAGACATA AAGAAGGAAC CTATCCAACC AGAAACTCCT 1500 
CAACCCAAAC CAAGTGTOCA GAAAACCAAG GATGCATCAT CTGCTCTGGC CTCTTTAAAT 1560 
TCCTCTCTGG AAATGGATAT GGAAGGACTA GAAGATTACT TTAGTGAAGA TTCTTAGGCA 1620 
GTTTTATAAC CCTTTTTCCT CAATAGCCTG TTTCCTCTTT TTAAGATTTT GCCTTTGTTG 1680 
TTGAAAAAGG GTTTCACTGT CACCAAGGCT TAGTGCAGTG ACACAATTAC AGCTGATTGC 1740 
AGCCTTGACC TTCCCAGCTC AAGTGATCCT OCTAOCTCAG CCTCCCAAGT AGTTAGGACA 1800 
CACAGGTGTG CACCTCATAT CCAGATAATT TTTTTCAATT TTTTTTTGTA GAGGTGGGGO 1860 
GTCTCCCTAT GTTGCCCAGG CAGATCTCAG ACTCCTGGGC TCAAGCGATC CTCACACCTC 1920 
AGCGTCCCAQ AGTGCTGGGA TTACAGTTGT GAGCCACTGT GCCTGGCCTT TTTTTTTTTT 1980 
TAACCTTTTC GTTTAACTTC TCTCTTCACT GCATCCCAAT CCATCTACAG GCATGCACAC 2040 
TTATTAGGAA AGGAGGTTTG AGGTAACAAC AGAGACTTTC ACTATATTTT GCTTTGACAG 2100 
AAGGAAAGAG GAGGAGTTTC TATTAAAATC TGTCACTTGA GTGATGTCAT TTAAGTCCTA 2160 
TTTTAGGAGA TAAAAACAGC TTTGGGGACT GGTTAAAGTC CCCCAGAAAC TACAATAAAG 2220 
AACAACTTTT GTTTTAACTC TTAATCACTT TGTAATTTTG ACTCAATCCT TTTCTGGACC 2280 
ATTTTTGTTA ATAAATATCA AAGTGT 

Protein Accession #: NP_000938 

1 11 21 31 41 51 

I I I I I I 

MEFSGRKRRK LRLAGDQRNA SYPHCLQFYL QPPSENISLT EFENLAIDRV KLLKSVENLG 60 
VSYVKGTBQY QSKLESBLRK LKFSYREKLE DEYEPRRRBH ZSEFZLRLAY CQSEELRRWF 120 
IQQQ4DLLRF RFSILPKDKI QDFUCDSQLQ FEAISDEEKT LKEQBXVASS PSLSGLKLGP 180 
ESIYKIPFAD ALDLFRGRKV YLEDGFAWP LKDIVAIILN EFRAKLSKAL ALTARSLPAV 240 
QSDERLQPLL NHLSHSYTGQ DYSTQGNVGK ISLDQH3LLS TKSFPPCMRQ LHKALRENHH 300 
LRHGGRMQYG LFLKGIGLTL EQALQFWKQB FIKGKMDPDK PDKGYSYNIR HSFGKEGKRT 360 
DYTPFSCLKI ILSNPPSQGD YHGCPPRHSD PELLKQKLQS YKISPGGISQ ILDLVKGTKY 420 
QVACQKYFHJ IHNVDDCGPS LNHPMQPPCE SQRILNGGKD IKKEPIQPET PQPKPSVQKT 480 
KDASSALASL 



SEQ ID NO:69 PDM3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024840 

Coding sequence: 108-491 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

I I I I I I 

AATTCATACA GGAGAGAAGT CATATATATG CAGTGATTGT GGAAAAGGCT TCATCAAGAA 60 

GTCTCGGCTC ATTAATCATC AGAGAGTTCA TACAGGAGAG AAACCA CATQ GATGCAGCCT 120 

GTGTGGGAAG GOCTTCTCCA AAAGGTCCAG GCTCACTGAA CACCAGAGAA CTCATACAGG 180 

AGAGAAGCCC TATQAATGCA CTGAATGTGA CAAAGCATTC CGCTGGAAAT CACAGCTCAA 240 

TGCACATCAG AAAGCTCACA CAGGAGAGAA GTCATATATA TGCCGTGATT GTGGAAAAGG 300 

CTTCATTCAG AAGGGAAATC TCATTGTACA TCAGCGAATT CATACTGGAG AAAAACCCTA 360 

TATATGCAAT GAATGTGGAA AAGGCTTCAT CCAAAAGGGC AACCTCCTTA TTCATCGACG 420 

TACTCACACT GGAGAGAAAC CCTATGAATG CAATGAATOT GGQAA AGGCT TCAGCCAGAA 480 

GACATGTTTA AT ATCCCATC AGAGATTTCA CACAGGAAAG ACACCCTTTG TATGTACTGA 540 

GTGTGGAAAA TOCTGCTCAC ACAAGTCAGG TCTCATTAAC CACCAGAGAA TTCACACAGG 600 

AGAGAAACCC TATACATGCA GTGACTGTQG GAAAGCTTTC AGAGATAAAT CATGTCTCAA 660 

CAGACATCGG AGAACTCATA CAGGGGAGAG ACCGTATGGA TGCTCTGATT GTGGGAAAGC 720 

TTTCTCCCAC TTGTCATGCC TTGTTTATCA TAAGGGAATG CTGCATGCAA GAGAGAAATO 780 

TGTAGGTTCA GTCAAATTGG AAAATCCTTG CTCAGAGAGT CATAGCTTAT CACATACACG 840 

TGATCTCATA CAGGATAAAG ACTCTGTTAA CATGGTGACT CTGCAGATGC CTTCTGTGGC 900 

AGCTCAGACC TCATTAACTA ACAGTGCGTT CCAAGCAGAG AGCAAAGTAG CCATTGTQAG 960 

CCAGCCTGTT GCCAGAAGTT CAGTCTCAGC AGATAGTAGA ATTTGCACAG AATAAAAACC 1020 

ATATGAATGC AGTGAATGTG GTAGTGCTTT CAGTGATCAA TTACATCATA TGTCACAAAA 1080 

AACACAGAGG AACAAACTGA TATATTCAAG GTGGAAAGCC CTTGAATAAA ACCTTATGGC 1140 

TAATAAGCAT ATACTCAGAG AAAAATAGTA TGAAGTGGAG ACTGGGAAAT TCTTTTATQG 1200 

GAAGATAGAT CTTCTCATCA GTGACCATAG ATCACATCTT CAGTGAGCTT ATAGTTGGTA 1260 

GAAATATAAT GATCATGGAA AAGTCCTTGT TCAGAAACAG TACGCCAGTA GGTATC AGGG 1320 

GGTTTACACA GGAGAGAAAC TTTTGGAAGA CCTTTGAAGG CTATGAATGT GGCAGGGTTG 1380 

CTAGTGGTAC ATTCTGCCTT ATCCTCAGAG GGAATCATAT AGAAATAAAA CTATGAAAAT 1440 

GTAACTAGAA CATCTTCATC AAAATATGAA AGAACACACG AAGCAAATAA GCCCTGTGAA 1500 

AAGGAGTATT TTAGAGATTT CGATCAGAAA TCTAACATCA TTATATGGCA GATAATATAC 1560 

AGGATGTGTA TTTTAGGAGA ATATACCTTG AATCACTAGT TGATATOTCA ATGACTAATT 1620 

AAAAGGGGTT GTCAGTGTTA CACATCATTG GTTAAATTTA TAGCACAATG TACCTCTTCC 1680 

CCCTTTTTTG ATAAGAGTCT TCTATTCCCA ACCAAGATCA TTATATGATT AOCTCTTOTO 1740 

TTTCTTTGAT TCCAAATTTC TTCACTTGTT ATTTCAGACT ACTGAAGCTC TTCAAAAGGA 1800 
AAAATGTATT TAATTTAATA ATCTAACACA ACAAQTTTGQ ATGTGTTTAA CTTTATAAAT 1860 
AATCACCCCA GAGGAATGAA GTTCAAAACT TGT6AATAAC C 



SEQ ID NO:70 PDM3 Protein sequence 

Protein Accession #: NP_079116 

1 11 21 31 41 51 

I I I I I I 

MDAACVGRPS PKGPGSLNTR BLIQBRSPMN ALKVTKHSAG NHSSMHIRKL TQERSHIYAV 60 
IVKKASFRRB ISLYXSEFIL EKNPIYRMNV EKASSKRATS LFIDVLTLER NPHNAMNVGK 120 
ASARRHV 

SEQ ID NO:71 PDM8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_018455 

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AATTTCGGCA CGGGGGGGAG GCACAGTGAG TCCACTGGGG CACG6CAGCG TCTAAGCCAC 60 

AAGCCGACTG ACATAAGCCA GGTCCTAACG GAGCCTATGT GTAAGTCCAC TACTGGTGCA 120 

AGQTTGCACA CTTCTAAGAA GAGOQOCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCOCCGTGCA GTOCCCTGTQ CC CAAGACAC AGCC TGATGC TTGTQCTCCO GTQG QCGQAC 240 

TTGGAGGCGG CGGGAACTGC AATTGGTGGC TTTGAAGGGC GGCGAQCGQO AACAOCTCTT 300 

GAGGAGTGAG ACTGCAGGAG ATGTGGGCOS TGCCAAAGAG ATQGATGAGA CTGTTGCTGA 360 

GTTCATCAAG AGGACCATCT TGAAAATCCC CATGAATGAA CTGACAACAA TCCTGAAGGC 420 

CTGGQATTTT TT GTCTG AAA ATCAACTGCA GACTGTAAAT TTCCGACAGA GAAAGGAATC 480 

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GGAAAAGCGT GCAAGTATCA GTGATGCTGC 540 

CCTGTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600 

GATCAGTAAA GGACCAGGTG AAGATGTTGA CCTTTTTGAT ATGAAACAAT TTAAAAATTC 660 

GTTCAAGAAA ATTCTTCAGA QAGCATTAAA AAATOTGACA OTCAQCTTCA GAGAAACTGA 720 

GGAGAATGCA GTCTGGATTC GAATTGCCTG GGGAACACAG TACACAAAOC CAAACCAGTA 780 

CAAACCTACC TACGTGGTGT ACTACTCCCA GACTCCGTAC GOCTTCACGT CCTCCTCCAT 840 

GCTGAGGCGC AATACACCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900 

CCQACAAGAG GAGATCATTT TAGATATTAC CQAAATGAAG AAAGCTT6CA ATTAGTQAAC 960 
ATGAAAGGAA AATAAAAA1T CCTCACAGTC AAAAAAAAAA AAAAA 

SEQ ID NO:72 PDM8 Protein sequence 
Protein Accession #: NP_060925 

1 11 21 31 41 51 

I I I I I I 

MBETVAEFIK RTILKIPMNE LTTILKAWDP LSENQLQIVN FRQRKESWQ HLIHLCEEKR 60 
ASISDAALLD IIYMQFHQHQ KVWDVPQHSK GPGEDVDLFD MKQFKNSFKK ILQRALKNVT 120 
VSFRETEENA VWIRIAWSTQ YTKPNQYKPT YWYYSQTP5T APTSSSMLRR NTPLLGQELE 180 
ATGKEYLRQE EIILDITEMK KACN 

SEQ ID NO:73 PDM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016192 

Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGQTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTCGA CACTTTGCGA GGGCTTTK5C 60 

TG GCTGC TGC TGCT GQCOGT CATGCEACTC ATCGTAGCCC GCCCGGTGAA GCTCGCTGCT 120 

TTCCCTACCT CCTTAAGTGA CTGCCAAACG CCCACCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCT CT GT G AC ACCAACACCT GTAAATTTGA TGG66AATQT 240 

TTAAGAATTG GAGACACTGT GACTTGCOTC TGTCAOTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGG GGACAGCTAC CACAATGAGT GTTACCTGCG ACAGGCTGCA 360 

TGCAAACAGC AGAGTGAGAT ACTTGTGGTQ TCAGAAGGAT CATGTGCCAC A6ATGCAG6A 420 

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATCC 480 

ACCTGTGATA TTTGCCAOTT TGGTGCAGAA TOTGACQAAO ATGCCGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC lXJT G CGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAA6AA GCATC G TGTC AGAAACAGOA GAAAATTSAA 660 

GTCATGTCTT TGGGTCGATG TCAAQATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720 

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAOAA 780 

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840 

TCTATCAATA TGCAGGAGOC ATCTTGCAGG TGTGATGCTG GTTATACTGO ACAACACTGT 900 

GAAAAAAAGO ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960 

TTAATCGCAG CTGTGATTGG AACAATTCAG ATTGCTGTCA UXJTGTGTGGT GGTCCTCTGC 1020 

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA 

SEQ ID NO:74 PDM9 Protein sequence 
Protein Accession #: NP_057276 

1 11 21 31 41 51 

I I I I I I 

1 MVLWBSPRQC SSWTLCEGFC WLLLLFVMLL IVARPVKLAA FPTSLSDCQT PTGWNCSGYD 60 

61 DRENDLFLCD TNTCKFDGEC LRIGDTVTCV CQFKCNNDYV FVCGSNGESY QNECYLRQAA 120 

121 CKQQSBILW SEGSCATDAG SGSGDGVHEG SGETSQRETS TCDICQFGAB CDKDAEDVWC 180 

181 VQUDCSQIN FNPLCASDGK SYDNACQIKE ASCQKQEKIE VMSLGRCQHN TTTTT KSB PG 240 

241 HYARTDYAEN ANKLEESARB HHTPCPEHYN GFCHHGKCEH SZNMQEPSCR CDAGYTGQHC 300 

301 EKKDYSVLYV VPGPVRFQYV LIAAVIGTIQ IAVICVWLC ITRKCPRSNR IHRQKQNTGH 360 
361 YSSDNTTRAS TRLI 

SEQ ID NO:75 PD01 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014324 

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGCGCGGGGA TTGGGAGGGC TTCTTGCAGG CTGCT66GCT G6GGCTAA0G GCTGCTCAGT 60 

TTCCTTCAGC G6GGCACTGG GAAQCGC CAT GGCACTGCAG GGCATCTCGG TCGTGGAGCT 120 

GT OC GQCCTG GCOCCGGGCC GTNTCTGTGC TATGQTCCTG GCTQACTTOQ GGGCGCGTGT 180 

GGTACGCGTG GACCGGCCCG GCTCCCGCTA CGACGTGAGC CGCTTGGGCC G0GGCAAGC6 240 

CTCGCTAGTG CTGGACCTGA AGCAGCCGCG GGAGCCGCGT GCTGCGGCGT CTGTGCAAGC 300 

GGTCGGATGT GCTGCTGGAG CCCTTCCGCC GCGGTGTCAT GGAGAAACTC CAQCTGGGCC 360 

CAGAGATTCT GCAGCGGGAA AATCCAAGGC TTATTTATGC CA6GCTGAGT GGATTTGGCC 420 

AGTTCAGGAA AGCTTCTGCC GGTTAGCTGG CCACGATATC AACTATTTGG CTTTGTCAGG 480 

TGTTCTCTCA AAAATTGGCA GAAGTGGTGA GAATCCGTAT GCCCCOCT6A ATCTCGTGGC 540 

1QA CTTTOCT G OTOGT GGCC TTATGTGTGC ACT6G6CATT ATAATGGCTC TTTTTGACCG 600 

CACACGCACT GACAAGGGTC AGGTCATTGA TGCAAATATG GTGGAAGGAA CAGCATATTT 660 

A AQTTC TTTT CTGTGGAAAA CTCAGA AATC GAGTCTGTGG GAAGCACCTC GAGGACAGAA 720 

CATGTTGGAT GGTGGAGCAC CTTTCTATAC GACTTACAGG ACAGCAGATG GGGAATTCAT 780 

GGCTGTTGGA GCAATAGAAC CCCAGTTCTA CGAGCTGCTG ATCAAAGGAC TTGGACTAAA 840 

GTCTGATGAA CTTCCCAATC AGATGAGCAC GGATGATTGG CCAGAAATGA AGAAGAAGTT 900 

TGCAGATGTA TTTGCAAAGA AGACGAAGGC AGAGTGGTGT CAAATCTTTG AOGGCACAGA 960 

TGCCTGTGTG ACTCCGGTTC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACAAGGA 1020 

ACGGGGCTCG TTTATCACCA GTGAGGAGCA GGACGTGAGC CCCCGCCTTG CACCTCTGCT 1080 

GTTAAACACC CCAGCCATCC CTTCTTCCAA AGGGGATCCT TTCATAGGAQ AACACACTGA 1140 

GGAGATACTT GAAGAATTTG GATTCAGCCG AGAAGAGATT TATCAGCTTA ACTCAGATAA 1200 

AATCATTGAA AGTAATAAGG TAAAAGCTAG TCTCTAACTT CCAGGCCCAC GGCTCAAGTG 1260 

AATTTGAATA CTGCATTTAC AGTGTAGAGT AACACATAAC ATTGTATGCA TGGAAACATG 1320 

GAGGAACAGT ATTACAGTGT CCTACCACTC TAATCAAGAA AAGAATTACA GACTCTGATT 1380 

CTACAGTGAT GATTGAATTC TAAAAATGGT TATCATTAGG GCTTTTGATT TATAAAACTT 1440 

TGGGTACTTA TACTAAATTA TGGTAGTTAT TCTGCCTTCC AGTTTGCTTG ATATATTTGT 1500 

TGATA7TAAG ATTCTTGACT TAZATTTTGA ATGGGTTCTA GTGAAAAAGG AATGATATAT 1560 

TCTTGAAGAC ATCGATATAC ATTTATTTAC ACTCTTGATT CTACAATGTA GAAAATGAGG 1620 

AAATGCCACA AATTGTATGG TGAT AAAAGT CACGTGAAAC AGAGTGATTG GTTGCATCCA 1680 

GGCCTTTTGT CTTGGTGTTC ATGATCTCCC TCTAAGCACA TTCCAAACPP TAGCAACAGT 1740 

TATCACACTT TGTAATTTGC AAAOAAAAGT TTCACCTGTA TTGAATCAGA ATGCCTTCAA 1800 

CTGAAAAAAA CATATCCAAA ATAATGAGGA AATGTGTTGG CTCACTACGT AGAGTCCAGA 1860 

GGGACAGTCA GTTTTAGGGT TGCCTGTATC CAGTAACTCG GGGCCTQTTT CCCCGTGGOT 1920 

CTCTGGGCTG TCAGCTTTCC TTTCTCCATG TGTTTGATTT CTCCTCAGGC TGGTAGCAAG 1980 

TTCTGGATCT TATACCCAAC ACACAGCAAC ATCCAGAAAT AAAGATCTCA GGACCCCCCA 2040 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 

Protein Accession #: NP_055139 

1 11 21 31 41 51 

I I I I I I 

1 MALQGISWE LSGLAPGRXC AMVLADFGAR WRVDRPGSR YDVSRLGRGK RSLVLDLKQP 60 
61 REPRAAASVQ AVGCAAGALP PRCHGETPAG FRDSAAGKSK AYLCQAEWIW FVQESFCRLA 120 
121 GRDZNYLALS GVLSKIGRSG ENPYAPLNLV ADFAGGQLMC ALGIIHALFD RTRTDKGQVI 180 
181 DANKVEGTAY LSSFLWKTQK SSLWEAFRGQ NMLDGGAPFY TTYRTADGEP MAVGAIEPQP 240 
241 YELLXKGLGL KSDBLPNQMS TDDWPEMKKK FADVPAKKTK ABWCQIFDGT DACVTPVLTF 300 

301 EEWHHDHNK ERGSFITSEE QDVSPRLAPL LLNTPAIPSS KGDPFIGEHT EEILEBFGFS 360 
361 REEIYQLNSD KIIBSNKVKA SL 

SEQ ID NO:77 POOS DNA SEQUENCE 

Nucleic Acid Accession #: AB028951 

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GTTAAATOCT TACTTTACCA GATTCTTGAT GGTATCCATT AOCTOCATGC AAAZTGGGTG 60 

CTTCACAGAG ACTTGAAACC AG CAAAT ATC CTAGTAATGG GAGAAGGTCC TGAGAGGGGG 120 

AGAGTCAAAA TAGCTGACAT OGGTTTTGOC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180 

GCAGATTTOG ATCCAGTAGT TCTGACAPTT TGGTATCGGG CTCCAG AACT TTTGCTTGGT 240 

GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300 

TTGACTTCGO AACCTATTTT TCACTGTCOT CAGGAAGAXA TAAAAACAAG CAATCCCTTT 360 
CATCATGATC AACTGGATCG OATATTTAQT OTCATGCGCT TTCCTGCAGA TAAAGACTGG 420 
GAAGATATTA GAAAGAIGCC AGAATATCCC ACACTTCAAA AA6ACTTTA6 AAGAACAACG 480 
TATGCCAACA GTAGCCTCAT AAAGTACATO GAGAAACACA AGGTCAAGCC TGACAGCAAA 540 
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA OCAAGAGAAT TACCTCGGAO 600 
CAAGCTCTGC AGGATCCCTA TTTTCAGGAG GAOCCTTTGC CAACATTAGA TGTATTTGCC 660 
GGCTGCCAGA TTCCATACCC CAAACGAGAA TTCCTTAATG AAGATGATCC TGAAGAAAAA 720 
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTCCACAG 780 
CAGGCAGCAG CCCCTCCACA GGCGCCCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840 
ACCGCAGGTG GGGCTGGGGC CGGGGTCGGO GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900 
GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960 
AACTCAGGTG GACCTGTGAT GCCCTCGGAT TATCAGCACT CCAGTTCTCG CCTGAATTAC 1020 
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA COCATCTCAC CAGGCCCACC GCTACTGAOC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAGG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 
CAAAAAAATG CAAACTATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260 
TACTGAG CAT TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
AGTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTGTQ ACTTCTCTGA TAAAGCGTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440 
1TTCTCATGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500 
AATGTTTTAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACTTCATGAA CTGATTTAGC 1560 
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACAGACAATA ACTGTTCTGC 1620 
TCATGGAAGT CTTAAACAGA AACTGTTACT GTCOCAAAGT ACTTTACTAT TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AAOCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAAGA ACTGAAGTAT 1800 
GTGGTAATTT TTATAGAATC ATTCATATGG AACTGAGTTC CCAGCATCAT CTTATTCTGA 1860 
ATAGCATTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GTAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTCA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAGGAAAAG TTTTTAGTGT GTATTGTACA TAAAOTCGGC TTCTCTAAAG AACCATTGGT 2100 
TTCTTCACAT CT GGGTCT GC GTGAGTAACT TTCTTGCATA ATCAAGGTTA CTCAAGTAGA 2160 
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAGTGT TCTCCATTCG TATTTGTATT 2220 
AGATATAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATSTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340 
TTTTCATTSC ACTGAATGAT yj c -l'lTi-'GCX; CCTAGGAGAA AACTTAATAA TTGTGCCTAA 2400 
AAACTATGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCATTA 2460 
TCTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAGCCCCTC ACTCCCCAGA 2520 
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTGTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCACCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAATAGCTGA 1TATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTATT CTTTCTGAAC 2700 
TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760 
CCAGTATACA TTTTGCACTA TTGATGTGAT ACTOTAGCCA GCCAGGACCT TACTGATCTC 2820 
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880 
GAAGCAGGTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCTG TTTTACCCCT 2940 
TCCATTTTTT AAAATAAGAA ATTAGCAGCC CTCTGCATAA TGTAGCTGCC TATATGCAGT 3000 
TTTATCCTGT GCCCTAAAGC CTCACTGTCC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060 
CCCTCACCAT GTGCCTCGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGOCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT ATGAAGGTCT TTTTGTTTAC TTCTAAACCC 3240 
ACTTGGGTAG TTACTATCCC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300 
GTCAGTCTAC CTTAGAGAAA GCCAGTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360 
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TOCTAAGTGG TTGTAGGAAA TTGCAAGGAA AATAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCGAG ATAGATATGT AAACCAAAAT ACTATGCCCC 3540 
TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600 
CCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660 
ATTTCGTCAG TTGCTCCCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACA CATGAC TGAGTTCCTA TCCCTGCACT GgT ACTGGC T CTTTTCTCCT Cl"lTOCTTGC 3780 
CTCAGGGTTC GTGCTACCCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840 
ClTlUri ' l - IA AAGGGGAACA AAGCCTTTTT TTTTTTTGAG ACGGAGTGTT GCTCTGTCAC 3900 
CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960 
GTGATTCTCC TGCCTCAGCC TCCCGAGEAG CTGGGACTAC GGGGACGCAC CACCACGTC9 4020 
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080 
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTCCCG AAGTGCTGGG ATTATAGGTG 4140 
TGAGCCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG CCCTCAAACC 4200 
GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260 
CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATGTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAACTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCTP TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 
CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500 
AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATOCTTO TGATGAAGTA 4560 
GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620 
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC T TTTO C T C TT ACGAACATAA 4680 
TG6ACTCTTA AQAATGGAAA GGGATGACAT TTACCTATGT GTGCTGCCTC ATTCCTGGTG 4740 
AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800 
AAAGAAAAGA AAAAAAIAOT TGGAAAAXAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860 
TTTATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920 
TCTATGATGA TU l Vf T OC TA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980 
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTOCC 5040 
CATAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TTTTTAAAAA 5100 
ACGTGTTGTA OCTAAGGCCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160 
TATGTATTAT AIAAAAAAAA AAAOCCTTAA TOCACTGTTA TCTCCTAAAT ATTTAOTAAA 5220 

TTAATACTAT TTAATTTTTT TAAAGATTTQ TCTGTOTAflA CACTAAAAOT ATTACACAAA 5280 

ATCTGGACTG AACGTCTCCT TTTTAACAAC AATTTAAAGT ACTTTTTATA TATGTTATGT 5340 

AGTATATCCT TTCTAAACTQ CCTAGTTTGT ATATTCCTAT AATTCCTATT TGTGAAGTGT 5400 

ACCTGTTCTT CTCTCTTTTT TCAGTCATTT TCTGCACGCA TCCCCCTTTA TATGGTTATA 5460 

GAGATGACTG TAGCTTTTOQ TGCTCCACTQ COAGGTTTGT GCTCAGAGCC GCTQCACCCC 5520 

AGCGAGGCCT GCTCCATOGA GTGCAGGACG AGCTACT6CT TTGGAGOGAS GGTTTCCTGC 5580 

TTTTGAGTTG ACCTGACTTC CTTCTTGAAA TGACTGTTAA AACTAAAATA AATTACATTG 5640 
CATTTATTTT ATATTCTTGG TTQAAATAAA ATTTAATTGA CTTTQ 

SEQ ID NO:78 PDQ3 Protein sequence 
Protein Accession #: BAA82980 

1 11 21 31 41 51 

I I I I I I 

VKSLLYQILD GIHYLHANWV LHRDLKPANI LVHQEGPERO RVKIADMGFA RLFNSPLKPL 60 
ADLDPVWTF WYRAPELLLG ARHYTKAIDI WAIGCIFAEL LTSEPIFHCR QEDHCTSNPF 120 
HHDQLDRIFS VMGFPADKDW EDIRKMPEYP TLQKDFRRTT YANSSLIKYM EKHKVKPDSK 180 
VFLLLQKLLT MDPTKRITSE QALQDPYFQE DPLPTLDVFA GCQIPYFKRE FLNEDDPEEK 240 
GDKNQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTHG TAGGAGAGVG GTGAGLQHSQ 300 
DSSUJQVPPW KKPRLGPSQA NSGGPVMPSD YQHSSSRLNY QSSVQGSSQS QSTLGYSSSS 360 
QQSSQYHPSH QAHRY 

SEQ ID NO:79 POOS DNA SEQUENCE 

Nucleic Acid Accession #: XM_002922 

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGAATCCTT TCCAGAAAAA TGAGTCCAAQ QAAACTCTTT TTTCACCTGT CTGCATTGAA 60 

GAGGTACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTCC 120 

AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180 

TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTQTATT TCCTGCACTO GAATGAAGAT 240 

ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300 

GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360 

TATGTGCTTG GCCATGTGAT CAAGTCCTTQ GGTGCCTTAC CAATACTGGG AGGACAAGTQ 420 

GTACACACAG TCCTATCATT GATCGGOCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480 

AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540 

ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTGAT TTCTACATTT 600 

ATCACACCCA TGCTGAGAGO AGATGTGCAA TOTTTTGGAG AAGACTGCTA TGCATTGGCT 660 

TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TQTTTGCAAT GGGAAGCAAA 720 

ATATACAATA AACCACCCCC TGAAGQAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780 

TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840 

CTAGACTGGG CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATQTAAA GGCACTGACC 900 

AGGGTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960 

TCACGATGGA CTTTGCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020 

CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080 

TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATG 1140 

GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200 

ATAAA TGAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260 

CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320 

GAGTCCATCA AATC CTTTC A GAAAACACCA CACT ATTCCA AACTGCACCT GAAAACAAAA 1380 

AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440 

GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCAGC 1500 

ATGATOGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560 

AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620 

GAAGACTATG GTGTGTCTGC TTATAGAACT GTCCAAAGAQ GAGAATACCC TGCAGTGCAC 1680 

TGTAGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740 

TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800 

ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860 

GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920 

ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980 

CTTGTTGTGG CACAGTTCAG TGGCCTGGTA CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040 

CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100 

ACAGAGGATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160 
AAACTAGAGA CCAAGAAGAC AAAACTCTGA 

Pro^AcceSonft 5 ™*xpjk^ 

1 11 21 31 41 51 

I I I I I I 

HNPFQKNESK ETLPSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60 
YGMKAVLILY FLYPLHWNED TSTSIYBAPS SLCYPTPILG AAIADSWLGK FKTIIYLSLV 120 
YVLGBVZKSL GALPILGGQV VHTVLSLIGL SLIALOTGGI KPCVAAPGGD QFBEKHAEBR 180 
TKYFSVFYLS IKAQSLISTF ITPHLRQDVQ CFGEDCYALA FGVPGLLHVI ALWPAKGSK 240 
IYNKPPPBGH rVAQVPKCIW FAISNRPKNR SGDIPKRQHW LDWAAEKYPK QLIMDVKALT 300 
RVXfFbYZPLP HFHALLDQQQ SKWTLQAIKK NRHLGFFVLQ PDQMQVLNPF LVLIFIPLFD 360 
FVTXRLVSKC GINFSSLRKM AVCiULACLA PAVAAAVEIK INEMAPAQSG PQEVFLQVLN 420 
LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDPHPHLKY HNLSLYTEHS 480 
VQBKNWYSLV IREDGNSISS MMVKDTESKT TOGMTTVRFV NTLHKDVNIS LSTDTSUJVG 540 
EDYGVSAYRT VQRGEYPAVH CRTEDKNPSL NLGLLDFGAA YLFVTTNNTN QGLQAWKIED 600 
IPAHKMSIAW QLPQYALVTA GEVMFSVTGL EFSYSQAPSS KKSVLQAAWL LTIAVGNIIV 660 
LWRQFSGLV QKABFILFSC LLLVICLIFS IHGYYYVPVK TEDHRGPADK KIPHIQGNKI 720 
KLETKKTKL 

SEQ ID NO:81 PD06 DNA SEQUENCE 

Nucleic Acid Accession #: NM_020448 

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTCCAGCAGC TGCCTCCCAC AAGTAGCTCC 60 

AGCGCCGTAA GCGAGGCCTC CT T CTOCTAC AAGGAAAACC TGATTGGCGC CCTCTTQGCQ 120 

ATCTTOGGGC ACCTCOTGGT CAGCATTGCA CTTAACCTCC AGAAGTACTO CCACATCCOC 180 

CTGGCAGGCT CCAAGGATCC CCGGGCCTAT TTCAAGACCA AGACATGGTC GCTGGGCCTO 240 

TTOCTGATSC TTCTGGGCGA GCTGGGTGTQ TTCGCCTCCT ACGCCTTCGC GC CG C TSTCA 300 

CTCATCOTGC CCCTCAGCOC AGTTTCTGTG ATAGCTAGTG CCATCATAGQ AATCATATTC 360 

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT ACGTCTTGTC CTTTGTTGGC 420 

TSC6GTTT60 CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480 

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT QBOCTTTCCT TTTGTACATG 540 

CTGGTCGAGA TCA3TCTQTT CTGCTTGCTG CTCTACTTCT ACAA6GA6AA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT CCATGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TGTCTTCTOC ATTCAAGGGA ACCTGCAGCT TGACTACCCC 720 

ATCTTCTACG T6ATGTTCGT GTGCATGGTG GCAACCGCCO TCTATCAGGC TGCGTTTTTG 780 

AGTCAAGCCT CACAGATGTA CGACTCCTCT TTGATTQCCA QTGTGGGCTA CATTCTGTCC 840 

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTACCTGG ACTTCATCGG GGAGGACGTG 900 

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960 

ACGCGTAACA GGAAGAAGOC CATTCCATTT GAGCCCTATA TTTCCATGGA TGCCATGCCA 1020 

GGTATGCAGA ACATGCACGA TAAAGGOATG ACTGTCCAGC CTGAACTTAA AGCTTCTTTT 1080 

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TGCCACCCTC 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTCCTA 1200 
QAGCACACCA AGAAGGA ATQ A 

SEQ ID NO:82 PPQ6 Protein sequence 
Protein Accession #: NP_065181 

1 11 21 31 41 51 

I I I I I I 

MDGSHSAALK LQQLPPTSSS SAVSEASFSY KENLIGALLA IPQHLWSIA LNLQKYCHIR 60 

LAG SKD PRAY FKTKTWWLGL FLHLLGBLGV FASYAFAPLS LIVPLSAVSV IASAIIGIIF 120 

IKEKWKPKDF LRRYVLSFVG CGIAWGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180 

LVEIILFCLL LYFYKEKNAN NIWILLLVA LLGSMTWTV KAVAGMLVLS IQGNLQLDYP 240 

IFYVMFVCMV ATAVYQAAFL SQASQMYDSS LIASVGYILS TTIAITAGAI FYLDPIGEDV 300 

LHICMFALGC LIAFLGVFLI TRNRKKPIPF KPYISHDAMP GHQNMHDKGM TVQPBLKASF 360 
SYGALENNDN ISEIYAPATL PVKQEEHGSR SASGVFYRVL EHTKKE 



SEQ ID NO:83 P008 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032712 

Coding sequence: 555-908 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCAICCC TCCAGACACT 60 

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAQ GAAAGCGCTG CCACCCACCC 120 

ACCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAG AACTGCAACA GGCTCTCCTG GGTCCTGCAO GTGTACAGCC GGGCCCCTGC 240 

CTTGTGOCTC AGCTCTCGAG AGCTGCTGCT GCCGGGTGAC CTGATCCAAC CTGATA AGGT 300 

GCCATCTTCA GCTACCACTG CAAGGOCCTG AGGGCAACAG CAGCACGGCA CTGCCCAGCC 360 

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTCGAGGC CACTGAGCCA 420 

CCCTTCCAGC OCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTCGTCCT 480 

CTGCCCCTGA GCCAGTGACG CCCAAGGACA TGCCTQTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGOCACC AGGCTCATGC TCGCACCCAA 660 

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGGTOG ATAAGTGGGC 720 

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTOCCTAT TCTCACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGACCTQ GGCCTG GC TC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTCGGG GAGCTGCTCC 900 

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960 

CTACCCTCCC TCTCGAGCTG TTCGGTGTCC GTCGAGCTA6 O^CACCCTG ACACCATGTT 1020 

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTCTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAAGCAGG CCTTGTGAGT GGACACTGAC CATCAGTCCC TGGGGGGAGT 1140 

GATCCCCCAG GCATCGTCTG CCATGTTGCA CTTCTCCCCA GGCAGCAGGG TCGGTGGGTA 1200 

CCATGGGTCC CCACCOCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260 

ACCCCACACC CTTGACATAA AAGCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA 

SEQ ID NO:84 P008 Protein sequence 
Protein Accession #: NP_116101 

1 11 21 31 41 51 
MTVLEAVLEI QMTGSRLLS MVPQPARPPO SCWDPTQCTR TOLLSHTPRR EWISGLPRAS 60 
CRLGBBPPPL PYCBQAYGEE LSIRHRETWA WLSRTDTAWP GAPGVKQARI LCELLLV 

SEQ ID NO:85 PDT1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000693 

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AGCCGGTGCG CCGCAGACTA GGGCGCCTCG GGCCAGGGAG CGCGGAGGAG CCATOGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCGGCCC TGCCGCGCCC 120 

CATCCGCAAC CTGGAGGTCA AGTTCACCAA QATATTTATC AACAATGAAT GGCACGAATC 180 

CAAQAOTGGO AAAAAGTTTG CTACATGTAA CCCTTCAACT C666A0CAAA TATGTGAAGT 240 

GGAAGAAGGA GATAAGCCCG ACGTGGACAA GGCTGTGGAG GCTGCACAGG TTGCCTTCCA 300 

6AGG6GCTCG CCATGGCGCC GGCTGGATGC CCTGAGTCGT GGGCGGCTGC TGCACCAGCT 360 

GGCT6ACCTG GTGGAGAGG6 ACCGCGCCAC CTTGOCCGCC CTGGAGACGA TGGATACAGG 420

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTGGGGCC ATCACTCCAT GGAACTTCCC 600 

CCTGCTGATQ CTGGTGTGGA AGCTGGCACC CQCCCTCTGC TGTGGGAACA CCATGGTCCT 660

QAAGCCTGCQ GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

GGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAS TGGGAGCAGC 780 

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840 

ACTGGTTAAA GAAGCTGCGT CCCGOAGCAA TCTGAAGCGG GTGA0GCT60 AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT GTGCGGACGC TGACTTGGAC TTGGCAGTGG AGTGTGCCCA 960 

TCAGGGAGTG TTCTTCAACC AAGGCCAGTG TTGCACGGCA GCCTCCAGGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA AACGGCCCGT 1080 

GGCAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAOTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGGG 1200 

CTCAGCCATG GAAGACAAGG GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260 

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AATCTCGACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TEAAAATGTC 1500 

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGCCGAA TACACAGAAG TGAAAACTGT 1560 

CACCATCAAA CTTGGCGACA AGAACCCCTG AAGGAAAGGC GGGQCTCCTT CCTCAAACAT 1620

CGGACGGCG6 AATGTGGCAG ATGAAATGTQ CTGGAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTGGAG TTGAATGATT GC TG mrilC C 1740 

TCTCACTCTC CTGTTTATTC ACCAGACTGG GGATGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTOTOTTT CCCTTTAAAC CAGATCCTGG 1860

AGACAGTGAG ATACTCAGGG CGTTGTTAAC AGGGAGTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAG GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TCCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTCOTTTT TGTTTTTTGT TTTCTTGTTT TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160 

TTGGTTAGTO CATACCGTGG AAGGGCGCCA GGGTCTTTGT GGATTGCATG TTGACATTGA 2220

CCGTGAGATT CGGCTTCAAA CCAATACTGC CTTTGGAATA TGACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTXTC TTAAGCAGAA 2340 

AATATTGTTO AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400 

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTA TCAC GTTCACTAAC AGCTTATGAT AAGTCTGTGT ft O TC TT CC T T 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGCGTA CACAACTGGA AAGACTGCTG 2580

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTAOTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760 

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCAIATAGCT TCAAAAACAA AAACAAATGT 2820 

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTG 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000 

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA OCAATAACTT 3060 

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGATAGTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCCG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGAIA 3420 
AAATAAAATT GGATATTTGA GA 

SEQ ID NO:86 PDT1 PROTEIN SEQUENCE

Protein Accession #: NP_000684

1 11 21 31 41 51 

I I I I I I 

MATANGAVEN GQFDGKPPAL PRPIRNLEVK FTKIFINNEW HBSKSGKKFA TCNPSTREQI 60 

CEVEEGDKPD VDKAVEAAQV AFQRGSPWRR LDRLSRGRLL HQLADLVERD RATLAALETM 120 

DTGKPFLHAF FTDLBGCIR2 LRY?AGWADK IQCJKTIPTDD NWCFTHHBP IGVCGAITPW 180 

JJFPLLMLVWK LAPALCCGOT MVLKPAEQTP LTALYLGSLI KEAGFPPGW NTVFGFGPTV 240 

GAAISSHPQI NKIAFTGSTB VGKLVKEAAS RSNLKRVTLE LGGKNPCXVC ADADLDLAVB 300 

CAHQGVFFNQ GQCCTAASRV FVEEQWSEF VRRSVBYAKK RPVGDPPDVK TEQGPQIDQK 360 

QFUJU.LELIB SGKKBGAKLB CGGSAMEDKG LFIKPTVFSE VTDNKRIAKE EIFGPVQPIL 420 

KFKSIEEVIK RANSTDYGLT AAVFTKNLDK ALKLASALES GTVWINCYNA LYAQAPFGGP 480 
KMSGNGRELQ EYALAEYTEV KTVTIKLGDK NP 



SEQ ID NO:88 PDV3 Protein sequence
Protein Accession #: NP_116031



SEQ ID NO:87 PDV3 DNA SEQUENCE

Nucleic Acid Accession #: NM_032642

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GACCATTAGC AGGCACCCAG GCCTGTCTTT GGCTCGGAAA CGGTGGCCCC CAATGTAGCC 60

TAGTTTQAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120 

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGGGC ACTGGGGAGG GCTGAGGCCG 180

ACCATGCCCA GCCTGCTGCT GCTGTTCACQ GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240

CTQACAGACO CCAACTCCTG GTGGTCATTA GCTTTGAACC CGGTGCAGAG AOOOGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAGT CAGCTTOCOG GGCTCTCCCC TGGCCAGAGG 360 

AAGCTGTGCC AATTGTACCA GGAGCACAT3 GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ATCAAGGAAT GCCAGCACCA GTTCCGGCAG CGGCGGTGGA ATTGCAGCAC AGCGGACAAC 480

GCATCTGTCT TTGGGAGAGT CATGCAGATA GGCAGCCGAG AGACCGCCTT CACCCACGCG 540

GTGAGCGCCG CGGGCGTGGT CAACGCCATC AGCCGGGCCT GCCGCGAGGG CGAGCTCTCC 600 

ACCTGCGGCT GCAGCCGGAC GGOGOGGCCC AAGGAOCTGC CCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGG ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780 

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900 

AAGGTCGGGG ACCGGCTGAA GGAGAAGTAC GACAGCGCGG CCGCCATGCG CGTCACCCGC 960 

AAGGGCCGGC TGGAGCTGGT CAACAGCCGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020 

TATGTGGACC CCAGCCCOGA CTACTGCCTG CGCAAOGAGA GCACGGGCTC CCTGGGCACG 1080 

CAGGGCCGCC TCTGCAACAA GACCTCGGAG GGCATGGATG GCTGTGAGCT CATGTGCTGC 1140

GGGCGTGGCT ACAACCAGTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200

TGGTGCTGCT TCGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260 

TAGCCCGGAG GGCCTGCTCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGGAGATCT CTGAGGAQTG 1440 

GACTTTGCTG GTTCTCTCCT CTTGGTGGGT GGGAGACAGG OCT Tm C T C TCC C TCTGOC 1500 

GAGGACTCTC AGGATGTAGG GACTTGGAAA TATTTACTGT CTGTCCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAQAG GCCCTATGAA GGTGGCGGGA 1680 

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA ATGTAGATGG GTTCCGTAAG 1740 

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860 

GAGATCTGCA CCTCCCGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGQAAAO GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980 

CCCTAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040 

GACGTTATAC TGTCTTCCCC CACCCCCGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TGGCGGGCCC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGOTTTCTCT CTGACATTAA ATGCCCTTCA 2220 
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 



1 11 21 31 41 51 

I I I I I I

HPSLLLLFTA ALLSSWAQLL TOANSWWSIA LNWQRPEMF IIGAQPVCSQ LPGLSPGQRK 60 
LCQLYQEHKA YIGEGAKTGI KBCQHQFRQR RWKCSTADNA SVPGRWMQIG SKETAFTHAV 120 
SAAGWKAIS RACREGELST CGCSRTARPK DLPRDWLWGG CQDNVEYGYR FAKEFVDARE 180 
REKNFAKGSB EQGRVLHNLQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT CWLQLAEFRK 240 
VGDRLKEKYD SAAAMKVTRK GRLELVNSRF TQPTPKDLVY VDPSPDYCLR NESTGSLGTQ 300 
GRLCHKTSEG HDGCELMCCG RGYNQFKSVQ VERCHCKFHW CCFVRCKKCT EIVDQY1CK* 



SEQ ID NO:89 PDT9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_033280
Coding sequence: 58-636 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATQ 60 

GTGCGTGCGG GCGCCGT6GG GGCTCATCTC CCCGCGTCCG GCTTGGATAT CTTCGGGGAC 120

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTOGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTQTTCCTT 300

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360

CGAGACATTC CAATAGTTCA CAGAQTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420

AAATTTCTGA CTAAAGGAGA TAATAATGAA GTTGATGATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGGTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA AGTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATG AGAAGCAGTT CCTGGGAOCA 660

GATTGAAATG AATTCTGTTG AAAAAGAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720
TGTATAAAAG GGAACAGTGT GGAGATGTTT TTQTCTTGTC CAAATAAAAG ATTCACCAGT 780
AAAAAAAAAA AAA* 

SEQ ID NO:90 PDT9 Protein sequence
Protein Accession #: NP_150596

1 11 21 31 41 51 

I I I I I I 

HVRAGAVGAH LPASGLDIFG DLKKMNKRQL YYQVLNFAMI VSSALMIWKG LIVLTGSESP 60 

IWVLSGSME PAFHRGDLLP LTNFREDPIR AGEIWFKVB GRDIPIVHRV HCVHEKDNGD 120 

IKFLTKGDNN EVDDRGLYKE GQNWLEKKEV VGRARGFLPY VGMVTIIMND YPKFKYALLA 180
VMGAYVLLKR ES 

SEQ ID NO:91 PDY5 DNA SEQUENCE

Nucleic Acid Accession #: NM_016590

Coding sequence: 691-675 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GATTACTCAC ACAGTCTTGA AGATGCAATC TCAGCTATTT AGGACAGAAA CATCCAAGGC 60 

CGTGTCAGAA CTCAATIACO ACTACA1ATG CATTAAGGCA GGAACT6GCA GGCCTCAGGG 120 

TAOGCCAACT ATAGGACTCG TGCTTCTCGT ACGCTGGOCT ATAATCTATQ AAACTGAGCT 180 

CCAGAGCCAG CCAATCACTT AGCTCCTCAT AACAAGTCTA ACTGGCTCTG GAAAGCTGAA 240 

AGGGCTGCAC TGGAACAACA CAGATGAGAT ATTCTACACA TTAATCTACT TATCTGGAAT 300 

CACTTTGCCT CTAAAGGCCA GAGAAAAATC ACAGCTTCCT TGTCGGAGGG GAAAAGGACA 360 

GGTGATCTGG GGAAAACGCA GCTACACCTG GAGCAAGGTC TCTTCCCGGC TTGGCAATCT 420 

CAGCTGTGCC GGCGCTACGG GACCCGAGCC GTCCCAGAAA CCAAAGGGCA GGCACGGCAG 480 

CAAACGCCTQ AGTGCTGCT6 CCTTCGGTGA CTATATGAGA ATGGAAAC7T CTAAGGAAGC 540

CAGGTTGTTA GAATTGTTAC CCCCTTTACT CAGAGATAAC ATAGATTATC CAGGCTGAGA 600 

TGGAAAACAA GCCCTTTATT GAATTTTCAA CACAGACTGC CTGCTTCTCA TCTCCTTAAT 660 

AAAATTTCAT TAAAATCCCC TTGAACTCCC ATGTTCAAAT CTCCATTTGT TGACAGACAA 720 

AGCCAACAAT ACTCTAAACT GAGGCCTGCA AGTCATTTCA TTTGTATTTT TGTCCAGAAA 780 

TTTCCCATAG GAAGACTTCA CCTCCTACAA CTCCGAAGAA AACCCTTACT GTCCAAGAOC 840 

GTCACCAGCA ACCATCCGCA GTCATTCAAG TGGAAGCTTT CACAGCTTTT GTACATTCTC 900 

TGTGTCAATA TACAACTGAG TTACAGACTG TCCCCTGGCT CCCTGACCCT TACAAACACT 960 

AAAAGTTTTG TTPGACTCAA CTTCAAGCTG CTCATCTGTT AGTAAGTGAT GTTCACTCCA 1020 

GAACACATTC ATGATGAGAA CTTTCTAAAA GACCAGCACT GCTCTTCCCC TCCTATAATC 1080 

ATAATAATCA TGATAACCTG AAACATGTTA CTGGGACTCG ACATTTTTCT GGGGATTQAA 1140 

ATCTTTAGTC CTTGGAGCTG TCACASAGCA GGGGCAACCT CACACTGAAA CAAAGGAAGT 1200 

GATCTCCCAT TA1TATCCAC CCTGAGCCAC CATAATATGC TGTTTACATT TATTTTCTTC 1260 

AGGCTGTGCA AAACAAAGCA ATGGAAAAGQ AAACTAAAAA ATATACATAC TAGTACCATT 1320 

ATCTTCTTTT GCCTAAAATT ACTAATGCAC CACOTCAGTC TQCTTOCTTC AGGCATCATT 1380

CTCAATTCAT CAGGACTTGT ATTAGCAGGT TCTGGCTAGA GAGACTATCT CCTGTCATCA 1440 

CGATCAATTA ATOTTTTCTG GTGATCACAT CAGGCCCTAT CTAAGAAGCT CATGGTAXAC 1500 

AAGGGTCACC CAAATAGCTG AGTGCAGTCC TTGCTCATAT TTCCTTCATC TTAACCCCGC 1560 

AAACAAGAAT TAAGATGATC CCAATAAAAG AAAAATTGCT CAGGAAACTG AACCTTTTTC 1620 

TGAACCAAGC ACTGTCAGCA AATCTCAGGT ATTAGAGCAA CTATGGTTGA TTGAAAAGTG 1680 

TCTCAAAATC TGGGCCAAGA ATGATTGCTA GGTCCATAAG CTAATTTGTC TGGCCTTGCC 1740 

ATTTACGTAA GCCAAAGAAA GTCACTCATG AGTAAACTAT AGAAAACGTT CAGACCCATC 1800 

CTGTTAGTAT GTCAAATCAA CTAAGACTGG GAGGGTATTA ACTCCATTCC AGGTGACATO 1860 

GAXAAAGAGC CCCATTATTT TCACAGTGCC AGCCTCTACC TAAGGAAACC CTAGACCTTG 1920 

GAACCAGTTT OCTGGTAGGG AACTGCTGAC AGTTTCAATG CTGACAGTTG GAGOCAATGC 1980

CTCATAGTGT AAACTGAAAG AAAAATAGTT GCTTTTTAAA ATGTCAGCAA GAAGGCCTGC 2040 

CTCATCTTAA CAAAGCAAAA AAAAATGCTT TAATTCAAAT TAAAAATCAT GAIACTAAAA 2100 
AAAAAAAA 

SEQ ID NO:92 PDY5 Protein sequence
Protein Accession #: NP_057674

1 11 21 31 41 51 

I I I I I I 

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV LLVRWAHYB TELQSQPIT 

SEQ ID NO:93 PEEB DNA SEQUENCE

Nucleic Acid Accession #: NM_002606

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60 

ATGGGATCCG GCTCCTCCAG CTACCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120 

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGOG ACATCATGGA CCTGTTCTGC 180 

ATCGCCACCG GCCTGCCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240 

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300 

GTGGCGATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAACCACAAG CCGTGGCCAG 360 

TCTGCTGAGA GACCACTGAG GGACAGACGG GriXS'l'G GGC C TCGAGCAGCC CCGGAGGGAA 420 

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCGAGAG AGCCOCAGGG GTGCXAOCAO 480 

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540 

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600 

TTGGCTGTOC TAGAGAAACG CGTGGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660
TGCAAGAGTO ACATTAAGAA GATGAGGGAG GAGCTGGCGG CCAGAAGCAG CAGGACCAAC 720

TGOOCCTGTA AGTACAGTTT TTTGGATAAC CACAAGAAGT TGACTCCTCG ACGCGATGTT 780 

CCCACTTACC CCAAGTACCT GCTCTCTCCA GAGACCATCG AGGOCCTGCG GAAGCCGAOC 840 

TTTGACGTCT GGCTTTGGGA GCOCAATGAG A3GCTGAGCT GCCTQGAGCA CATGTACCAC 900 

GACCTCGGGC TGGTCAGGGA CTTCAGCATC AACCCTGTCA CCCTCAGGAG GTGGCTGTTC 960

TGTGTCCACG ACAACTACAG AAACAACCOC TTCCACAACT TCCGGCACTG CTTCTGCGTG 1020 

GQCCAGATGA TGTACAGCAT GGTCTGGCTC TGCAGTCTCC AGGAGAAGTT CTCACAAACG 1080 

GATATCCTGA TCCTAATGAC AGCGGCCATC TGCCAOGATC TGGACCATOC CGGCTACAAC 1140 

AACACGTACC AGATCAATGC CCGCACAGAG CTGGCGGTCC GCTACAATGA CATCTCACCG 1200

CTGGAGAAOC ACCACTGCGC CGTGGCCTTC CAGATCCTCG CCGAGCCTGA QTGCAACATC 1260 

TTCTCCAACA TCCCACCTGA TGGGTTCAAG CAGATOCGAC AGGGAATGAT CACATTAATC 1320 

TTGGCCACTG ACATGGCAAG ACATGCAGAA ATTATGGATT CTTTCAAAGA GAAAATGGAG 1380 

AATTTTGACT ACAGCAACGA GGAGCACATG ACOCTGCTGA AGATGATTTT GATAAAATGC 1440 

TGTGATATCT CTAACGAGGT CCGTCCAATG GAAGTCGCAG AGCCTTGGGT GGACTGTTTA 1500 

TTAGAGGAAT ATTTTATGCA GAGCGAOOGT GAGAAGTCAG AAGGCCTTCC TGTGGCACCG 1560 

TTCATGGACC GAGACAAAGT GACCAAGGCC ACAGOCCAGA TTGGGTTCAT CAAGTTTGTC 1620 

CTGATCCCAA TGTTTGAAAC AGTGACCAAG CTCTTCCCCA TGGTTGAGGA GATCATGCTG 1680 

CAGCCACTTT GGGAATOCCG AGATCGCTAC GAGGAGCTGA AGCGGATAGA TGACGCCATG 1740 

AAAGAGTTAC AGAAGAAGAC TGACAGCTTG AOGTCTGGGG CCACCGAGAA GTCCAGAGAG 1800

AGAAGCAGAG ATGTGAAAAA CAGTGAAGGA GACTGTGCCT GAGGAAAGCG GGGGGCGTGG 1860

CTGCAGTTCT GGACGGGCTC GCCGAGCTGC GCGGGATCCT TGTGCAGGGA AGAGCTGCCC 1920 

TGGGCACCTG GCACCACAAG ACCATGTTTT CTAAGAACCA TTTTGTTCAC TGATACAAAA 1980 
AAAAAAAAAA A 



SEQ ID NO:94 PEEB Protein sequence

Protein Accession #: NP_002597



1 11 21 31 41 51

I I I I I I

KGSGSSSYRP KAIYLDZDGR IQKVTFSKYC NSSDZMDLFC IATGLPRNTT ISLLTTDDAM 60

VSIDPTMPAN SERTPYKVRP VAIKQLSAGV EDKRTTSRGQ SAERPLRDRR WGLEQPRRE 120

GAFESGQVEP RPREPQGCYQ EGQRIPFERE ELIQSVLAQV AEQPSRAFKI NELKAEVANH 180

LAVLBKKVEL EGLKWEIEK CKSDIKKMRE ELAARSSRIN CPCKVSFLDN HKKLTPRRDV 240

PTYPKYLLSP ETIEALRKPT FDVWIA7BPNE MLSCLEHMYH DLGLVRDPSI NPVTLHRWLP 300

CVHDNYRNNP FHNFRHCFCV AQMHYSMVWL CSLQEKFSQT DILIUTTAAI CHDLDHPGYN 360

NTYQINARTE LAVRYNDISP LENHHCAVAF QILAEPECNI FSNIPPDGPK QIRQGMITLI 420

LATDKARHAE IHDSFKBKME NFDYSNKEHM TLLKMILIKC CDISNEVRPM EVAEPWVDCL 480

LEEYFHQSDR EKSEGLPVAP FMDRDKVTKA TAQIGPIKFV LIPKFBTVTK LPPHVEEIML 540

QPLWESRDRY EELKR1DDAM KELQKKTDSL TSGATEKSRB RSRDVKNSEG DCA





SEQ ID NO:95 PEG4 DNA SEQUENCE

Nucleic Acid Accession #: none

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60 

TGCCCTCCTG GGGGGCCCCA CCTGGGCAGG GAAGATGTAT GGCCCTGGAG GAGGCAAGTA 120 

TTTCAGCACC ACTGAAGACT ACGACCATGA AATCACAGGG CTGCGGGTGT CTGTAGGTCT 180 

TCTCCTGGTG AAAAGTGTCC AGGTGAAACT TGGAGACTCC TGGGACGTGA AACTGGGAGC 240 

CTTAGGTGGG AATACCCAGG AAGTCACCCT GCAGCGAGGC GAATACATCA CAAAAGTCTT 300 

TGTCGCCTTC CAAGCTTTCC TCCGGGGTAT GGTCATGTAC ACCAGCAAGG ACCGCTATTT 360 

CTATTTTGGG AAGCTTGATG GCCAGATCTC CTCTGCCTAC CCCAGCCAAG AGGGGCAGGT 420 

GCTGGTGGGC ATCTATGGCC AGTATCAACT CCTTGGCATC AAGAGCATTG GCTTTGAATG 480 

GAATTATCCA CTAGAGGAGC CGACCACTGA GCCAOCAGTT AATCTCACAT ACTCAGCAAA 540 

CTCACCCGTG GGTCGCTAGG GTGGGGTATG GGGGCATCCG AGCTGAGGOC ATCTGTGTGG 600 

TGGTGGCTGA TGGTACTGGA GTAACTGAGT CGGGACGCTG AATCTGAATC CACCAATAAA 660 
TAAAGCTTCT GCAGAATCAG TGAAAAAAAA A 

Protein Accession #: FGENESH predicted

1 11 21 31 41 51 

I I I I I I

MLLLLTLALL GGPTWAGKMV GPGGGKYFST TEDYDHEITQ LRVSVGLLLV KSVQVKLGDS 60 
WDVKLGALGQ NTQEVTLQPG BYITKVFVAF QAPLRGMVMY TSKDRYPYPG KLDGQISSAY 120 
PSQBGQVLVG IYGQYQLLGI KSIGPEWNYP LEEPTTEPPV NLTYSANSPV GR 



SEQ ID NO:97 P09 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006953

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CCGTTCCGCG CTCTGGCGGC TCCTCCOGGG CGATGCCTOC GCTCTGGGCC CTGCTGGCCC 60 

TOGGCTGOCT GCGGTTOGGC TCGGCTGTGA AOCTGCAGCC CCAACTGGOC AGTGTGACTT 120 

TCGOCAOCAA CAACCCCACA CTTACCACTG UXaaC C T TU GA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT GGCAOCCACG AGGTCTAOCT GTATGTCCTG GTCGACTCAG 240 

OCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC COCACTGGGC TCAACGTTOC 300 
TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCCCCT 360

GCAGTGAOCT GCCCAGCCTG GATGCCATTG GGGATGTGTC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAACG GGACCTGCCT GTGGGATCCC AACTTCCAGQ 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGQAGTACAQ GTTCAAGTAT GTCCTGGTCA 540 

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTGTGGTC GGACCCCATC CGCACCAACC 600 

AGCTCACCCC ATACTCGACG ATCGACACGT GGCCAGGCCG GCGGAGCGGA GGCATGATCG 660 

TCATCACrrC CATCCTGGGC TCCCTGCCCT TCTTTCTACT TGTGGGTTTT GCTGGCGCCA 720

TTGOCCTCAG CCTCGTGGAC ATGGGQAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780 

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGGAC AGGGCTGAGG TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCACCACCC CTGGGCAGCA GCATCCTCCT CICTQGCCTT GCCOCAGGCC CTGCAGCGGT 960 

GGTTGTCACA OCCTGACTTC AGGGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 

Protein Accession #: NP_008884

1 11 21 31 41 51

I I I I I I

MPPLWALLAL GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKPLCMFD SKEALTGTHE 60 

VYLYVLVDSA ISRHASVQDS TNTPLGSTFL QTEGGRTGFY KAVAFDLIPC SDLPSLDAIG 120 

DVSKASQILN AYLVKVGANG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN MSTGLVEDQT 180 

LWSDPZRTNQ LTPYST1DTW PGRRSGGMIV ITSILGSLPF PLLVGFAGAI ALSLVDMGSS 240 
DGETTBDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD 

SEQ ID NO:99 PEN1 DNA SEQUENCE

Nucleic Acid Accession #: NM_012391

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GOCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGCAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180 

TCAGAGGGCC ACCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGCCTCT 240 

GCCACCAGCC CTGCTGGCCC CTOQTTOCCC TGGCCCCCCA GATGCCTGGC TGAGACACGC 300

CAGTGGCCTC AGCTGCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AGCCGCCAGC CCAAACAGCA GCGGCATGGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGOTCTCG AGAGACGGGA 540 

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600 

CTTTGACATG CTGTACCCTG AGGACAGCAG CTGGGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780 

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840 

CATCACCGCA GATCCCATGG ACTGGAGCCC CAGCAATGTG CAGAAGTGGC TCCTGTGGAC 900 

AGAGCACCAA TACCGGCTGC CCCCCATGGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCCG CCAGCOCTCG CCCCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACCGACAGCQ AGGTGGACTC 1140

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGCCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380 

CATCTCCCAG CGCCTCGTCT ACCAGTTCGT GCACCCCATC TGASTGCCTG GCCCAGGGCC 1440 

TGAAACCCGC CCTCAGGGGC CTCTCTOCTG CCTGCOCTGC CTCAGCCAGG CCCTGAGATG 1500

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGAOCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCCC TGGAGGACAG AGGGAGACAG GGCTGCTCCC 1680 

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAAGGG CAGTGACTCG 1740 

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCCCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 
CCCGGGAATG GATAATAAAG ATACTAGAGA ACTS 

SEQ ID NO:100 PEN1 Protein sequence
Protein Accession #: NP_036523

1 11 21 31 41 51 

I I I I I I 

HGSASPGL5S VSPSHLLLPP DTVSRTGLEK AAAGAVGLER RDWSPSPPAT PEQGLSAPYL 60 

SYFDMLYPED SSHAAKAPGA SSREEPPEKP EQCPVXDSQA PAGSLDLVPG GLTLEEHSLE 120 

QVQSMWGEV LKDXETACKL LNITADPMDW SPSNVQKWLL WTEHQYRLPP KGKAFQBLAG 180 

KELCAMSEEQ FRQRSPLGGD VLHAHLDIWK SAAKMKERTS PGAIHYCAST SEESWTDSEV 240 

OSSCSGQPIB LWQFLKELLL KPHSYGRPIR WLNKEKGIFK IEDSAQVARL WGIRKNRPAH 300 
NYDKLSR5IR QYYKKGIIRK PDISQRLVYQ PVHPI 

SEQ ID NO:101 PEN3 DNA SEQUENCE

Nucleic Acid Accession #: NM_000742

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I 

GAGAGAACAG CGTGAGCCTQ TGTQCTTQTQ TGCTGAGCCC TCATCCCCTC CTGGGGCCAO 60

GCTTGGGTTT CACCTGCAGA ATCGCTTGTG CTGGGCTGCC TGGGCTGTCC TCAGTGGCAC 120 

CTGGATGAAG OCGTTCTGGC TGCCAGAGCT GGACAGCCCC AGGAAAACCC ACCTCTCTGC 180 

AGAGCTTOCC CAGCTOTCOC CGGGAAGCCA AATGCCTCTC ATGTAAGTCT TCTGCTCGAC 240 

GGGGTGTCTC CTAAACCCTC ACTCTTCAGC CTCTGTTTGA CCATGAAATG AAGTGACTGA 300

GCTCTRTTCT GTACCTQCCA CTCTATTTCT GGGGTQACTT TTGTCAGCTG CCCAQAATCT 360 

CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT TOTCTTCTQT AACCACAGGT 420 

TCGGTGGTGA GAGGAAGCCT CGCAGAATCC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480 

TCTGCTGGGG ACATGGTCCA TGGTGCAACC CACAGCAAA6 CCCTGAOCT6 ACCTCCTGAT 540 

GCTCAGGAGA AOCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAQCTCAGCC 600 

TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAGC TAAGCGCCCA CCTCCCAGGG 660 

CTCCTGGAGA CCCACTCTCC TCTCCCAGTC OCACGGCATT GCCGCAGGGA GGCTCGCATA 720 

CCCAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CTACAACCGC T6G6CGC6CC 780 

CGGTGCCCAA CACTTCAGAC GTGGTGATTG TGCGCTTTGG ACTGTCCATC GCTCAGCTCA 840 

TCGATGTGGA TGAGAAGAAC CAAATGATGA CCACCAACGT CTGGCTAAAA CAGGAGTOOA 900 

GCGACTACAA ACTGCGCTGG AACCCCGCTG ATTTTGGCAA CATCACATCT CTCAGGGTCC 960 

CTTCTGAGAT GATCTGGATC CCCGACATTG TTCTCTACAA CAATGCAGAT GGGGAGTTTG 1020

CAGTGACCCA CATGACCAAO GCCCACCTCT TCTCCACOGG CACTGTGCAC TGGGTGOCOC 1080 

CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACGTCAOCTT CTTCCCCTTC GACCAGCAGA 1140 

ACTGCAAGAT GAAGTTTGGC TCCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200 

TQGAGCAGAC TGTGGACCTQ AAGGACTACT GGGAGAGCGG CGAGTGGGCC ATCGTCAATO 1260 

CCACGGGCAC CTACAACAGC AAGAAGTACG ACTGCTGCGC CGAGATCTAC CCCGACGTCA 1320

CCTACGCCTT CGTCATCCGG CGGCTGCCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380 

GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTOCGAC TGCGGCGAGA 1440 

AGATCACGCT GTGCATTTCG GTGCTGCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500 

AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAOTACCTG CTGTTCACCA 1560

TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTTCGT GCTCAATGTG CACCACOGCT 1620

CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTGTGCCOC 1680

GGTGGCTTCT GATGAACCGG CCCCCAGCAC CCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740 

AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAOGAG AGGGAGGTGG 1800 

TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860 

TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920 

AGGAGGGTGA GCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGGAAGGT GTGCACTACA 1980 

TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATG 2040 

TTGCCATGGT CATCGACAGG ATCTTCCTCT GGCTGTTTAT CATCGTCTGC TTCCTGGGGA 2100

CCATCGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160 

CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220 

ATTTGGAGAT GAGCCCAAAG TGCCAGGGAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280 

CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTTT GGAGTCTGTC CGAGTTTGCA 2340 

GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCCGGA AGGGGAGGAA 2400 

GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460 

ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520 

CCAGGCTTCT CCTTGACGTC ATTCCTCTCC TTCCTTGCTG CAAAATGGCT CTGCACCAGC 2580 

CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640 
TACGCGTGCA GCAGGCAAAC AAGA 

SEQ ID NO:102 PEN3 Protein sequence
Protein Accession #: NP_000733

1 11 21 31 41 51 

I I I I I I 

KGPSCFVFLS FTKLSLWWLL LTPAGGEEAK RPPPRAPGDP LSSPSPTALP OGGSHTETBD 60 

RLPKHLFRGY NRWARPVPNT SDWIVRFGL SIAQLUJVDB KNQKMTTNVW LKQEWSDYKL 120 

RWNPADPGNI TSLKVPSEMT WIPDIVLYKN ADGEFAVTHM TKAHLFSTGT VHWVPPAIYK 180 

SSCSH3VTPP PPDQQNCKMK FGSWTYDKAK IDLEQMEQTV DLKDYWESGB WAIYNATGTY 240 

NSKKYDCCAE IYEDVTXAFV IHHLPLFYTI NLIIPCLLIS CLTVLVPYLP SDCGEKITLC 300 

ISVLLSLTVF LLLITEIIPS TSLVIPLIGB YLLFTMIFVT LSIVITVFVL NVHHRSPSTH 360 

TKPHWVRGAL LGCVPHWLLM NRPPPPVELC HPLRLKLSPS YHWLESNVDA EEREWVEEE 420 

DRKACAGHVA PSVGTLCSHG HLHSGASGFK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480 
RSEDADSSVK EDWKYVAMVI DWFLWLFII VCFLGTIGLF LPPFLAGMI 

SEQ ID NO:103 PBM DNA SEQUENCE

Nucleic Acid Accession #: NM_018670

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60 

CGGCCCCCAG ACGCGCCGCC GCTGOCATGG CCCAGCCCCT GTGCCCGCCG CTCTCOGAGT 120

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180 

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240 

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300 

GCGGCGCGCG CAGCAGCCGC CTGGGCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420 

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480 
ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTGCC 540

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG GCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600 

CGCAGATGCA GACACGGACO CAGGCTOAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660 
TATCCGCCGT CCGCGCCGGG GCGTCCT6GG GATOOCCGCC T6CCTGCCCC GGAGCCCGAG 720

CTGCACCCGA CCCGCGCGAC CCGCCTGCGC TGTTCGGCGA GGCGGCGTGC CCGGAAGGGC 780 

AGGCGATGGA GCCAAGCCCA CCGTCCCCGC TCCTTCCGGG CGACGTGCTO CCTCTOTTOO 840

AGACCTGGAT GCCCCTCTCQ CCTCTGGAOT GGCTGCCTGA GGAGCCCAAG TOACAAGOGA 900 

CAACTGAOGC CGTCTCTGTG AGCACCGAGG CTTTTTGGCC TCAGCACCTT CGAAGTGGTT 960 

CCTTGGCAGA CTGCCTTTCC TGGAA6AG66 CACGGGCGAT CCC6AC66G0 GCATTCCTGC 1020 

GGGTOAGAGC OGTCCCCACC GCGGOGGCCC TTCTCAQCCC CTCCCTCCAT GGAGGGACCC 1080 

ATAGGGCTAG ACACTTTGAG GCAAGCAGGA GGCTCTGCCT AATGTGAATT TATTTATTTG 1140 
TGAATAAACT GTACTGGTGT CAAAAAAAAA AAAAAAAAAA A 



SEQ ID NO:104 PBM Protein sequence
Protein Accession #: NP_061140

1 11 21 31 41 51 

I I I I I I 

HAQPLCPPLS ESWMLSAAWG PTRRPPPSDK DCGRSLVSSP DSWGSTPADS PVASPARPGT 60 

LRDPRAPSVQ RRGARSSRLG SGQRQSASER EKLRMRTLAR ALHELRRFLP PSVAPAGQSL 120

TKIETLRIAI RYIGHLSAVL GLSEESLQRR CRQRGDAGSP RGCPLCPDDC PAQMQTRTQA 180 

EGQGQGRGLG LVSAVRAGAS WGSPPACPGA RAAPEPRDPP ALFAEAACPB GQAMEPSPPS 240 
PLLPGDVLAL LBTWHPLSPL EWLPEEPK 



SEQ ID NO:105 PEU5 DNA SEQUENCE

Nucleic Acid Accession #: NM_017636

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCACGGAGAA GGCCACCGAT GCCTACGGAG AGCTGGACTT CACGGGGGCC GGOCGCAAGC 60 

ACAGCAATTT CCTCCGGCTC TCTGACCGAA CGGATCCAGC TGCAGTTTAT AGTCTGGTCA 120 

CACGCACATG GGGCTTCCGT GCCCCGAACC TGGTGGTGTC AGTGCTGGGG GGATCGGGGG 180 

GCCCCGTOCT CCAGAOCTGG CTGCACGACC TGCTGCGTCG TOGGCTGGTG CGGGCTGCCC 240 

AGAGCACAGG AGCCTGGATT GTCACTGGGG GTCTGCACAC GGGCATCGGC CGGCATGTTG 300 

GTGTGGCTGT ACGGGACCAT CAGATGGCCA GCACTSGGGG CACCAAGGTO GTGGCCATGG 360 

GTGTGGCCCC CTGGGGTGTG GTCCGGAATA GAGACACCCT CATCAACCCC AAGGGCTCGT 420 

TCCCTGCGAG GTACCGGTGG CGCGGTGACC CGGAGGACGG GGTCCAGTTT CCOCTGGACT 480 

ACAACTACTC GGCCTTCTTC CTGGTGGACG ACGGCACACA CGGCTGCCTG GGGGGOGAGA 540 

ACCGCTTCCG CTTGCGOCTG GAGTCCTACA TCTCACAGCA GAAGAOGGGC GTGGGAGGGA 600 

CTGGAATTGA CATCCCTGTC CTGCTCCTCC TGATTGATGG TGATGAGAAG ATGTTGACGC 660 

GAATAGAGAA CGCCACCCAG GCTGAGCTCC CATGTCTCCT CGTGGCTGGC TCAGGGGGAG 720 

CTGCGGACTG CCTGGOGGAG ACCCTGGAAG ACACTCTGGC CCCAGGGAGT GGGGGAGCCA 780 

GGCAAGGCCA AGCCCGAGAT CGAATCAGGC GTTTCTTTCC CAAAGGGGAC CTTGAGGTCC 840 

TGCAGGCCCA GGTGGAGAGG ATTATGACCC GGAAGGAGCT CCTGACAGTC TATTCTTCTG 900 

AGGATGGGTC TGAGGAATTC GAGAOCATAG TTTTGAAGGC OCTTGTGAAG GCCTGTGGGA 960 

GCTCGGAGGC CTCAGOCTAC CTGGATGAGC TGCGTTTGGC 'LVLWJCVIXAI AACOGCGTGG 1020 

ACATTGCCCA GAGTGAACTC TTTCGGGGGG ACATCCAATG GCGGTCCTTC CATCTCGAAG 1080 

CTTOCCTCAT GGACGCCCTG CTGAATGACC GGCCTGAGTT CGTGCGCTTG CTCATTTCCC 1140 

ACGGCCTCAG CCTGGCJCCAC TTCCTGACCC CGATGCGCCT GGCCCAACTC TACAGCGCGG 1200 

CGCCCTCCAA CK3GCTCATC CGCAACCTTT TGGACCAGGC GTCCCACAGC GCAGGCAOCA 1260 

AAGGCCCAGC OCTAAAAGGG CGAGCTGOGG AGCTCCGGCC CCCTGACGTG GGGGATGTGC 1320

TGAGGATGCT GCTGGGGAAG ATGTGCGOGC OGAGGTACCC CT006GGGGC GOCTGGGACC 1380

CTCACCCAGG CCAGGGCTTC GGGGAGAGCA TGTATCTGCT CTOGGACAAG GCCACCTOGC 1440

CGCTCTCGCT GGATGCTGGC CTCGGGCAGG OCCCCTGGAG CGAOCTGCTT CTTTGGGCAC 1500

TGTTGCTGAA CAGGGCACAG ATGGCCATGT ACTTCTGGGA GATGGGTTOC AATGCAGTTT 1560

CCTCAGCTCT TGGGGOCTGT TTGCTGCTOC GGGTGATGGC AOGCCTGGAG CCTGACGCTG 1620

AGGAGGCAGC AOGGAGGAAA GACCTGGCGT TCAAGTTTGA GGGGATGGGC GTTGACCTCT 1680 

TTGGCGAGTG CTATCGCAGC AGTGAGGTGA GGGCTGCCCG CCTCCTCCTC CGTCGCTGCC 1740 

CGCTCTGGGG GGATGCCACT TGCCTCCAGC TGGCCATGCA AGCTGACGOC CGmS ti CX'S C L* 1800 

TTGCCCAGGA TGGGGTACAG TCTCTGCTGA CACAGAAGTG GTGGGGAGAT ATGGCCAGCA 1860 

CTACACCCAT CTGGGCCCTG GTTCTCGCCT TCTTTTGCCC TCCACTCATC TACACCCGCC 1920 

TCATCACCTT CAGGAAATCA GAAGAGGAGC CCACACGGGA GGAGCTAGAG TTTGACATGG 1980 

ATAGTGTCAT TAATGGGGAA GGGCCTGTOG GGACGGCGGA OOCAGCCGAG AAGACGCCGC 2040 

TGGGGGTCCC GOGCCAGTOG GGCCGTCCGG GTTGCTGCGG GQQOCCCTGC GGGGGGCGCC 2100

GGTGCCTACG CCGCTGGTTC CACTTCTGGG GCGCGCCGGT GACCATCTTC ATGGGCAACG 2160 

TGGTCAGCTA OCTGCTGTTC TTGCTGCTTT TCTCGCGGGT GCTGCTCGTG GATTTCCAGC 2220 

CGGCGCCGCC CGGCTCCCTG GAGCTGCTGC TCTATTTCTG GGCTTTCAOG CTGCTGTGCO 2280 

AGGAACTGCG CCAGGGCCTG AGOGGAGGOG GGGGCAGCCT CGCCAGCGGG GGCCCCGGGC 2340 

CTGGCCATGC CTCACTGAGC CAGCGCCTGC GCCTCTACCT OGCCGACAGC TGGAACCAGT 2400 

GCGACCTAGT GGCTCTCACC TGCTTCCTCC TGGGCG7GGG CTGCCGGCTG ACCCCGGGTT 2460 

TGTACCACCT GGGCCGCACT GTCCTCTGCA TCGACTTCAT GOTTTTCACG GTGC GG CTGC 2520 

TTCACATCTT CACGGTCAAC AAACAGCTGG GGCCCAAGAT CGTCATCGTG AGCAAGATGA 2580 

TGAAGGAOGT GTTCTTCTTC CTCTTCTTCC TCGGCGTGTG GCTGGTAGOC TATGGCGTGG 2640 

CCACGGAGGG GCTCCTGAGG CCACGGGACA GTGACTTCCC AAGTATCCTQ CGCCGCGTCT 2700 

TCTACCGTCC CTACCTGCAG ATCTTCGGGC AGATTCCCCA GGAGGACATO GACGTGGCCC 2760 

TCATGGAGCA CAGCAACTGC TCGTOGGAGC OCGGCTTCTG GGCACACOCT CCTGG GG OCC 2820 

AGGCGGGCAC CTGCGTCTCC CAGTATGCC A ACTGGCTGGT GGTGCTGCTC CTCGTCATCT 2880 

TCCTGCTCGT GGCCAACATC CTGC TGQTCA ACTTGCTCAT TGOCATGTTC AGTTACACAT 2940 

TOGGCAAAGT ACAGGGCAAC AGCGATCTCT ACTGGAAGGC GCAGCGTTAC CGCCTCATCC 3000 

GGGAA3TOCA CTCTOGGOCC GCGCTGGCCC CGCCCTTTAT OG TCA TCTOC CACTTGCGCC 3060 

TCC TGCTCAG G CAATT GTGC AGGCGACOOC GGAGCCCCCA GC O G1CCTCX: COGGCOCTCG 3120 

AGCATTTOCG GOTTTACCTT TCTAAGGAAG CCGAGCGGAA GCTGCTAACG TGGGAATCGG 3180 
TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGGAGAGC GACTCCGAGC 3240 

GTCTGOAGCO CACGTCCCAG AAGGTGGACT T6GCACT6AA ACAGCTGGGA CACAT0C6C0 3300 

AGTACGAACA GCGCCTGAAA GTGCTG6AGC GGGAGGTCCA GCAGTGTAGC CGCGTCCT6G 3360 

GGTGGGTGAC GTAGGCOGTT AGCAGCTCTG CCATGTTCCC CTCAGGTGGG CCGCCACCCC 3420 

TTSACC TGCA TGGGTCCAAA GAGTGAGCCA T6CTG6CGGA TTTTAAGGAG AAGCCCCCAC 3480 

AGGGGATTTT GCTCTTAGAQ TAAGGCTCAT GTGGGCCTOG GCCCCCGCAC CTGGT6QCCT 3540 

TGTCCTTGAG GTGACCCCCA TGTCCATCTG GGCCACTCTC AGGACCACCT TTGGGAGTGT 3600 

CATCCTTACA AACCACACCA TGCCCGGCTC C7CCCA6AAC CAGTCCCAGC CTGGGAGGAT 3660 

CAAGGCCTGG ATCCC6GG0C GTTATCCATC TGGAGGCTGC AGGGTCCTTG GGGTAACAG6 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTGG GGAAATAAAG CCATTTCA6A 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:106 PEU5 Protein sequence 
Protein Accession #: NP_060106 

1 11 21 31 41 51 

I I I I I i 

MASTGGTKW AMGVAPWGVV RNRDTLINPX GSFPARYRWR GDPEDGVQFP LDYNYSAFFL 60 

VDDGTHGCLG GENRFRLRLE SYISQQKTGV GGTGXDXPVL LLLHX3DEKM I/TRIENATQA 120 

QLPCLLVAGS GGAADCLAET LEDTLAF6SG GARQGEARDR IRRFPPKGDL EVLQAQVERI 180 

MTRKELLTVY SSEDGSEEPB TIVLKALVKA CGSSEASAYL DELRIAVAHN RVDIAQSELF 240 

RGDIQWRSFH LEASLHDALL NDRPEFVRLL ISHGLSLGHF LTPMRLAQLY SAAPSNSLIR 300 

NLLDQASHSA GTKAPALKGG AAELRPPDVG HVLRMLLGKH CAPRYPSGGA KDPHP6QGFG 360 

ESMYLLSDKA TSPLSLDAGL GQAFWSDLLL WALLLNRAOM AMYFWEMGSN AVSSALGACL 420 

LLRVMARLEP DAEEAARRKD LAPKPEGMGV DLFGECYRSS EVRAARLLLR RCPLWGDATC 480 

LQLAMQADAR AFFAQDGVQS LLTQKWWGDH ASTTPIWALV LAFFCPPLIY TRLITFRKSE 540 

EEPTREELEF DMDSVTNGEG PVGTADPAEK TFLGVPRQSG RPGCCGGRCG GRRCLRRWFH 600 

FWGAPVTIFM GNWSYLLFL LLFSRVLLVD FQPAPPGSLE LLLYFWAFTL LCEELRQGLS 660 

GGGGSLASGG PGPGHASLSQ RLRLYLADSW NQCDLVALTC FLLGVGCRLT PGLYHLGRTV 720 

LCIDFMVFTV RLLHIFTVNK QLGPKTVIVS KMMKDVFFFL FFLGWLVAY GVATEGLLHP 780 

RDSDFPSILR RVFYRPYLQI FGQIPQEDMD VALMEHSNCS SEPGFWAHPP GAQAGTCVSQ 840 

YANWLWLLL VIFLLVANIL LVNLLIAHFS YTFGKVQGNS OLYWKAQRYR LIREFHSRPA 900 

LAPPFIVISH LRLLLRQLCR RPRSPQPSSP ALEHFRVYLS KEABRKLLTW ESVHKENFLL 960 
ARARDKRESD SERLERTSQK VDLALKQLGH IREYEQRLKV LEREVQQCSR VLGWVT 

SEQ ID NO:107 PEW3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005982 

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGTAGCAGCA TCCACCGGGC GGGAGGTCGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

OGCOGGOCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCCGCC 120 

CACCGCCAAG TTCCGACTCC GGTTTTCGCC TTTGCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AGOCGCGCCC CCCTCCCTGC GGCCGCCGOC CCCTGOCTCT CGGCTCTGCT CCCTGCOGCG 240 

TGOGCCTGGG CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGGCG TGCGTGTGOG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAG AACGAGAGCG 420 

TACTCAAGGC CAAGGCGGTG GTCGCCTTCC ACCGCGGCAA CTTCCGTGAG CTCTACAAGA 480 

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCAOCCCAA ACTGCAGCAA CTGTGGC7GA 540 

AGGOGCATTA CGTGGAGGCC GAGAAGCTGC GCGGOOGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCCGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGGGTGTCC TCCGGGAGTG GTACGOGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGCGG GA GCTGC CCG AGGCCACCGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT TAAGAACCGG AGGCAAAGAG ACOGGGCCGC GGAGGOCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTCC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900 

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCAOC TCCCCAAAGT OCAGAOCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCAOGCCAG CSAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCACCAGCAT CAGCTCCAAG 1080 

ACTCTCTGCT CGGOCCCCTC ACCTCCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140 

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTGC AGCGACTAGG GACACTTGTA 1200 

AATAG AAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGGACTTTC ACAAATATCT TTTTAAAAAT CAAAAOCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT OCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GA3CAGGG 

SEQ ID NO:108 PEW3 Protein sequence 
Protein Accession #: NP_005973 

1 11 21 31 41 51 

I I I I I I 

MSMLPSFGFT QEQVACVCEV LQQGGNLERL GRFLWSLPAC DHLHKNBSVL KAKAWAFHR 60 

GNFRELYKIL ESHQFSPHNH PKLQQLMliKA HYVEAEKLRG RPLGAVGKYR VRRKFPLPRT 120 

IWDGEETSYC FKEKSRGVLR EWYAHNPYPS PREKRELAEA TGLTTTQVSN WFKNRRQRDR 180 

AAEAKERENT ENNMSSSNKQ KQLSPLEGGK PLMSSSEEEF SPPQSPDQNS VLLLQGNMGH 240 
ARSSNYSLPG LTASQPSHGL QTHQHQLQDS LLGPLTSSLV DLGS 

SEQ ID NO:109 PW8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005069 

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons) 






1 11 21 31 41 51 
I I I I I I 

GGGGCTCCGC GGGOCTGOAG CACGGCCGGG TCTAATATGC CCGGAGCOGA GGOGCGATQ A 60 
AGGAGAAGTC CAAGAATGCG GCCAAGACCA GGAGGG AGAA GGAAAATGGC GAGTTTTAOG 120 
AGCTTGCCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180 
TCATCCGOCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCCCCGAA GGTTTAGGAG 240 
ACGOOTCGOG ACAGOCGAGC CGCGCCOGG C CCCTGGAOGG OGTCGOCAAG GAGCTGGGAT 300 
CGCACTTGCT GCAGACTTTG GATGGATTTG 1111 rGTGGT AGCATCTGAT GGCAAAATCA 360 
TGTATATATC CGAGAOCGCT TCTGTOCATT TAGGCTTATC GCAGGTGG AG CTCACGGGCA 420 
ACAGTATTTA TGAATACATC CATOCITCTG ACCAOGATG A GATGACCGCT GTOCTCAOGG 480 
CCCAGCAGCC GCTGCACCAC GACCTGCTCC AAG AGTATG A GATAG AGAGG TCGTTCTTTC 540 
TTCGAATGAA ATGTGTCTTG GGGAAAAGGA ACGCGGGCCT GACCTGCAGC GGATACAAGG 600 
TCATCCACTO CAGTGGCTAC TTQAAGATCA GGCAGTATAT GCTGO AC ATO TCCCTGTACG 660 
ACTCCTtjCTA CCAGATTGTG GGGCTGGTGG CCGTGGGCCA GTCGCTGCCA CCCAGTGCCA 720 
TCAOCGAG AT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780 
TGATATTCCT GGATTCCAGG GTG ACCG AGG TGACGGGTTA CGAGOCGCAG GACCTGATOG 840 
AG AAGACCCT ATACCATCAC GTGCAOGGCT GCGACGTGTT CCACCTCCGC TACGCACACC 900 
ACCTCCTGTT GGTGAAGGGC CAGGTCACCA CCAAGTACTA CCGGCTGCTG TCCAAGCGGO 960 
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGOGGC TGGTCOGGGC 1020 
COCACTGCAT CGTGAGTGTC AATTATGTAC TCAOGGAGAT TGAATACAAG GAACTTCAGC 1080 
TGTCCCTGGA GCAGGTGTCC ACTGCCAAGT CCCAGGACTC CTGGAGGACC GCCTTGTCTA 1140 
CCTCACAAGA AACTAGG AAA TTAGTG AAAC OCAAAAATAC CAAG ATGAAG ACAAAGCTGA 1200 
GAACAAACCC TTACCCCCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260 
GCCAGCTCGG AAACTGGAGA GCCAGTCCOC CTGCAAGCGC TGCTGCTCCT OCAGAACTGC 1320 
AGCCCCACTC AGAAAGCAGT GACCTTCTGT ACACXiCCATC CTACAGCCTG CCCTTCTCCT 1380 
AOCATTACGG ACACTTCCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440 
OGGCCAAGTT CGGGCAGCCC CAAGGATCGC CTTGTGAGGT GGCACGCTTT TTGCTGAGCA 1500 
CACTGCCAGC CAGCGGTGAA TGCCAGTGGC ATTATGCCAA CCCCCTAGTG CCTAGCAGCT 1560 
CGTCTCCAGC TAAAAATCCT CCAGAGCCAC CGGCGAACAC TGCTAGGCAC AGCCTGGTGC 1620 
CAAGCTACGA AGCGCCCGCC OCOGCCGTGC GCAGGTTCGG CGAGGACACC GCGCCCCCGA 1680 
GCTTCCCGAG CTGCGGCCAC TACCGOGAGG AGCCCGCGCT GGGCCCGGCC AAAGCCGCCC 1740 
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG CGCTGGCCCG CGCGGCACCC GAGTGCTCCG 1800 
CGCOCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GCCCTTCGTG CTGCTCAACT 1860 
ACCACCGCGT GCTGGCCCGG CCCGG ACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCCOGGCGGC CCCGAGGCGG CGACCGGCGC GCTGCGGCTC CGGCACCCGA 1980 

TCATCACCAA CGGGAGGIQA, CCCQCTQGCC GOOCGCGCCA GGAGCCTGGA CCCGGCCTCC 2100 
CGGGGCTGCG GCGCCACCG A GCCCGGCAAA TGCGCACG AC CTACATTAAT TTATGCAGAG 2160 
ACAGCTGTTT GAATTGGAOC CCGCCGCCG A CTTGOGGATT TOCACCGCGG AGGCCCOGCG 2220 
CGCCGCTGCC G AGGGCCGAG G AGCGCCCGG GTCCGGGCAG GTGACCGCCC GCCTCTGTCC 2280 
TGCGAGGGCC GGTGCGACCC AGTTGCTGGG GGCTTGGTIT CCTCACCTTG AAATCGGGCT 2340 
TCACGCGTCT TGCCTTGTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
TTCACTCCGC AAATATCCTC CACTTTCAGG AGGGAAAACC CACCCTACCA CAGTCCGCIC 2460 
TTCCAAOTGG ACGGCAG ACC TGGGAGGGGA CGCCTGTGTC ACGAGCCCTT TTAGATGCTT 2520 
AGGTGAAGGC AG AAGTGATG ATTGTAAGTC CCATGAATAC ACAACTCCAC TGTCTTTAAA 2580 
AGTCATTCAA GAGTCTCATT A l l l ' l ' lUnT TTATTTAACC CmiTIC AA TACAAAAAGC 2640 
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTCCCA AATACATTAA CAAGCTCTTA CTTCCCCCTA ACCCCTATG A ACTCTTG ATA 2760 
ACACCAAG AG TAGCACCTTC AO AATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820 
TAGCCAG ACA GTTTATGAGA ATGACCCTGT CAAGCTTCAT TATTACGTGG CAAAATCCCT 2880 
CTGGCCCACA CAGATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC ACGTGCAATA CGGAACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000 
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGCGAG 3060 
AAAACTTCGT AAGAACATGT TACGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120 
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180 
GATCTCAACA TGG AAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAAOCGCCGT 3240 
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300 
ATTTTAGGAA AC1TI 1 I CCA CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360 
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTGA ACTG ACTTTT 3420 
TTTTTTTTTT TTTTGCCAAC CCTGTGTCAC TTAGTGAGOA CCTGACACAA TCCCTACAGG 3480 
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGG A CCGTGGGTCA 3540 
TGCAGCGAAG GGGCTGG ATG GTAGGAAGGG ATGTGGOCGC CTCTOCACGC ACTCAGCTAT 3600 
ACCTCATTCA CAGCTCCTTG TGAGTGTGTG CACAGGAAAT AAGCCGAGGG TATTATTTTT 3660 
TTATGTTCAT GAGTCTTGTA A TTAAACCGT GATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA COGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTACCTTGTT 3780 
ATTAACTTTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840 
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG 



SEQ ID NO:110 Protein sequence 
Protein Accession #: NP_005060.1 

1 11 21 31 41 51 
I I I I I I 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SHRLTTSYL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGV AKEL GSHLLQTLDG FVFWASDGK IMYISETAS V HLGLSQVELT 120 
GNSIYEYIHP SDHDEMTAVL TAHQKJfflHL LQEYEEERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS A1THKLYSN MFMFRASLDL 240 
KUFLDSRVT EVTGYEPQDL EKTLYHHVH GCD VFHLRYA HHLLLVKGQV TTKYYRIXSK 300 
RGOWVWVQSY ATWHNSRSS RPHOVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNFYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYG HFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGBCQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPO APAQLPFVLL 600 
NYHRVLARRG PLGG AAPAAS GLACAPGGPE AATG ALRLRH PSPAATSPPG APLPHYLG AS 660 
VHTNGR 



SEQ ID NO:111 PFJ7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006549 

Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

A2SAAOGGAC GCTGCATCTG OOOGTOOCTG CCCTACTCAC OOGTCAGCTC CCCGCAGTCC 60 
TCGGCICGGC TGCOOCGGCG GOOGACAGTG G AGTCTCACC ACGTCTOCAT CAOGGGTATG 120 
CAGGACTGTG TGCAGCTGAA TCAGTATACC CTGAAGGATG AAATTGGAAA GGGCTCCTAT 180 
GGTGTOGTCA AGTTGGCCTA CAATGAAAAT GACAATACCT ACTATGCAAT GAAGGTOCTG 240 
TCCAAAAAG A AGCTG ATOCG GCAGGCCGGC TTTOCACGTC GOGCTCCAOC CGGAGGCAGC 300 
OGGOCAGCTC CTGGAGGCTG CATOCAGOGC AGGGGOCOCA TTGAGCAGGT GTAOCAGGAA 360 
ATTGOCATCC TCAAGAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTCCTGGAT 420 
GACCCCAATG AGGACCATCT GTACATGGTG TTCGAACTGG TCAACCAAGG GOGOGTGATG 480 
GAAGTGOOCA CCCTCAAACC ACTCTCTG AA G ACCAGGCCC GTTTCTACTT CCAGG ATCTG 540 
ATCAAAGGCA TCGAGTACTT ACACTACCAO AAGATCATOC AOCGTGACAT CAAAOCITCC 600 
AACCTCCFGG TCGGAGAAGA TGGGCACATC AAGATCGCTC ACTFTGGTOT GAGCAATGAA 660 
TTCAAGGGCA GTG ACGCGCT CCTCTCCAAC ACOGTGGGCA GGGCCGGCIT CATGGCAOOC 720 
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG OCTTGGATGT TTGGGCCATG 780 
GGTGTG ACAC TATACTGCTT TGTCTTTGGC CAGTGCCCAT TCATGGACGA GCGGATCATG 840 
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900 
GAGGACTTG A AGGACCTG AT CACCOGTATG CTGGACAAGA ACGCCGAGTC GAGGATCGTG 960 
GTGCCGG AAA TCAAGCTGCA COCCTGGGTC ACG AGGCATG GGGOGGAGOC GTTGCCGTGG 1020 
GAGGATGAGA ACTGCACGCT GGTOGAAGTG ACTGAAGAGG AGGTCGAGAA CTCAGTCAAA 1080 
CACATTOOCA GCTTGGCAAC CGTG ATCCTG GTGAAGACCA TGATAOGTAA AOGCTOCTTT 1140 
GGG AAOOCAT TCGAGGGCAG CCGGCGGGAG GAACGCTCAC TGTCAGCGCC TGGAAACTTG 1200 
CTCACCAAAA AACCAACCAG GG AATGTG AG TOOCTGTCTG AGCTCAAG AC CI&QAAAATA 1260 
AGTCCCCTTC CTGGCTGTTG CAAAGTAACG TAAGAGTTOC CTCACCCGAG TGGATGCAGA 1320 
CGTICITGCT GTCAGCCAOC TTOCTTCATA CACATAGOCA GCCCAGGGTG ACCAG AACGT 1380 
CCCAGGACAG ATGAGGCTTT GTGTGCTTAT G AGAGTGGGA GAAOCTGGTG GGCACCCCTG 1440 
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500 
CCTGACTTGG TGGGAGTTGC ATTCAGTCAC TTCTGTTTCT TAAACATAGC TTTACTGAGG 1560 
TACAATTCAC ATAOCATGTA ATTCACOCAC GGGAAOTGTA TGATTCAGTG GTXTCTAATA 1620 
CACACTICTG CAGCCATTAC CACCGTCAAC TTTACG ACAT TTTCATCAGC CCAAGAAGAC 1680 
ACOCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCCACC CCAGTAACCA CTCAGAATAG 1740 
GTATGG ATXT GCCTATTCTG GACGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800 
AAAA 



SEQ ID NO:112 PFJ7 Protein sequence 
Protein Accession #: NP_006540.1 

1 11 21 31 41 51 
I I I I I I 

MNGRQCPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEIGKGSY 60 
G WKLAYNEN DNTYYAMKVL SKKKURQAG FPRRPPPRGT RPAPGGGQP RGPIEQVYQB 120 
IAILKKLDHP NWKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYFQDL 180 
DCGIEYLHYQ KIIHRDIKPS NLLVGEDGHI KIADFG VSNE FKGSDALLSN TVGTPAFMAP 240 
ESLSBIRMF SGKALDVWAM GVTLYCFVFO QCPFMDERIM CLHSKIKSQA LEFPDQPDIA 300 
EDLKDUTRM LDKNPESRIV VPEOCLHPWV TRHG AEPLPS EDENCTLVEV TEEEVENSVK 360 
HIPSLATVIL VKTMIRKRSF GKPFEGSRRE ERSLSAPGNL LTKKPTRECB SLSELKT 



SEQ ID NO:113 PFJB DNA SEQUENCE 

Nucleic Acid Accession #: NM_021810 

Coding sequence: 1-429 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

AlfiAAACCTC TGATATGGAC ATCGTCAGAT GTTGAAGGCC AG AGGCCGGC TCTGCTCATC 60 
TGCACAGCIG CAGCAGG ACC CACGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGOCA 120 
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTTAOCCAG ATGCCACAAT GCACAGACAA 180 
CTDCTGGCTC OGGTGGAAGG AAGGATGGCA GAGACATTGA ATCAG AAACT GCATGTTGGC 240 
AATGTGCTGG AAGA1X3AOGC OGGCTAOCTA CCTCACGTCT ACAGCGAGGA AGGGGAGTGT 300 
GGAGGGGGOC CATCOCTCAG CTCICFGGOC AGCTTGGAAC AGGAGTTGCA ACCTG ATTTG 360 



CTGOACTCTT TGGGTTCAAA AGCG ACTGCG TTTG AGGAAA TATATTCAGA GTCAGGTGTT 420 
CCITCCTAA 



SEQ ID NO:114 PFJS Protein sequence 
Protein Accession #: NP_068582.1 

1 11 21 31 41 51 
I I I I I I 

MKPUWTWSD VEGQRPALLI CTAAAGPTQG VKGYGKPFEP RSVKNIHSTP AYPDATMHRQ 60 
ILAPVEGRM A ETLNQKLHVA NVLEDDPGYL PHVYSBEGEC GGAPSLSSLA SLEQELQPDL 120 
LDSLGSKATP FEHYSESGV PS 




SEQ ID NO:115 PFJS DNA SEQUENCE 

Nucleic Acid Accession #: NM_006361 

Coding sequence: 131-985 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

GGAATGCAGG CGACTTGOGA GCTGGGAGCG ATTTAAAACG CTTTGGATTC CCCCGGCCTG 60 
GGTGGGG AG A GCG AGCTGGG TGCCCCCTAG ATTCCCOGCC COOGCACCTC ATG AGCCG AC 120 
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCITGGAT GG AGCCAAGG ATATOGAAGQ 180 
CITGCTGGGA GCGGG AGGGG GGCGGAATCT GGTCGOCCAC TCCCCTCTGA CCAGOCACCC 240 
AGCGGCGCCT AOGCTGATGC CTGCTGTCAA CTATGCCCCC TTGGATCTGC CAGGCTCGGC 300 
GG AGCCGCCA AAGCAATGCC ACCCATGCCC TGGGGTGCCC CAGGGGACGT CCOCAGCTCC 360 
CGTGCCITAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTOOC GGAGCTCGCT 420 
GAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480 

GGAAG AGTAC OOCAGTOGOC OCACTGAGTT TCOCTTCTAT OOGGGATATC OGGGAACCTA 540 
CCACGCTATG GCCAGTTAOC TGG AOGTGTC TGTGGTGCAG ACTCTGGGTG CTCCTGGAG A 600 
ACCGCGACAT GACTCCCTGT TGCCTGTGGA CAGTTACCAG TCTTGGGCTC TCGCTGGTGG 660 
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAG AACOCA CCAGGTCCCT TTTGGAAGGC 720 

AGCATTTGCA GACTCCAGCG GGCAGCACCC TGCTGACGCC TGCGCCTTTC GTCGOGGCCG 780 
CAAGAAACGC ATTOCGTACA GCAAGGGGCA GTTGOGGGAG CTGGAGCGGG AGTATGCGGC 840 
TAACAAGTTC ATCACCAAGG ACAAGAGGOG CAAGATCTCG GCAGOCACCA GCCTCTCGGA 900 
GCGCCAGATT ACCATCTGGT TTCAGAAOCG COGGGTCAAA GAGAAG AAGG TTCTCGCCAA 960 
GGTGAAGAAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTG 1020 
GGGGTGTCCT GGGGAGACCA GAAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080 

AGAGGCOOCT AGAGACAACA CCCTTCOCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG 1140 
OGGCCTGGGT ACOCAGTATG TGCAGGGAGA CGGAAGCOCA TGTGACAGGC OCACTOCACC 1200 
AGGGTTCCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 


SEQ ID NO:116 PFJS Protein sequence 
Protein Accession #: NP_006352.1 

1 11 21 31 41 51 

MEPGNYATLD GAKDIBGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60 
KQCHPCPGVP QGTSPAPVPY GYPGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120 
PSRPTEFAFY PGYPGTYHAM AS YLDVSWQ TLG APGEPRH DSLLPVDSYQ SWALAGGWNS 180 
QMCCQGEQNP PGPFWKAAFA DSSGQHPPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240 
ITKDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN S ATP 



SEQ ID NO:117 PFJ4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005628 

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60 
CCTCOCGGAC CTAAGAGCCT GGGTCCOCTG TTTCOGGAGG TCCGCTTOOC GGCCCCCAGA 120 
T7XTGGCATC GCAGGCCTCA GTGTCCAAG A OCCAGGCAGC CCGGGTCCXX GGCTOCOGGA 180 
TXCAGGCGTC GGGG ATCTGC GCCACCAG AA OCTAGCCICC TGCAGAOCTC CGOCATCTGG 240 
GGGCACTCAA CCTCCTGGAG CCAAGGGCOC CACGTCCCAC CCA GAGA AAC TCTCGTATTC 300 
CCAGCTCCTA GGGCCAAGGA ACCOGGGCGC TCOGAACTCC CAGCTTTCGG ACATCTGGCA 360 
CACGGGGCAG AGCAGAG AAG CTCAGCGCCC AGOCTGGGGA ATTTAAACAC TCCAGCTTOC 420 
AAGAGCCAAG G AACTTCAGT GCTGTGAACT CACAACTCTA AGO AGCCCTC CAAAGTTCCA 480 
GTCTOCAGGT GCTGTTACTC AACTCAGTCC TAGGAACGTC GGGTCCTGGG AAGGAGGCXZA 540 
AGCGCTCCCA GCCAGCTTCC AGGCG CTAAG AAAOCOOGGT GCTTCCCATC ^TGGTGGCCG 600 
ATCCTCCTCG AGACTCCAAG GGGCTCX5CAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660 
CGCTGGOCTC CATCGAGGAC CAAGGCGCGG CAGCAGGOGG CTACTGCGGT TCCOGGGAOC 720 
AGGTGOGCOG CTGCCTTCGA GOCAAOCTGC TTGTGCTGCT G ACAGTGGTG GCCGTGGTGG 780 
CCGGCGTGGC GCTGGGACTG GGGGTGTCGG GGGCCGGGGG TGCGCTGGCG TTGGGCCCGG 840 
AOCGCTTGAG CGCCTTCGTC TTOCCGGGCG AGCTGCTGCT CCGTC1GCTG CGGATGATCA 900 
TCTTGQOGCT GOTCGTOTCC AGCTTG ATCO GOGG CGCCGC CAGCCTGGAC CCCGGCGCGC 960 
TCGGCOGTCT GGGCGCCTGG GCGCTGCTCT 1 1 1 ICCTGGT CACCACGCTG CTGGCGTCGG 1020 
OGCTCGOAGT CGGCTTGGOG CTGGCTCTGC AGCCGGGOGC CGOCTOGGOC GOCATCAACG 1080 
CCTCCGTGGG AGCCGCGGGC AGTGOCG AAA ATGOOOOCAG CAAGGAGGTQ CTOGATTCGT 1140 
TCCTGGATCT TGCGAGAAAT ATCTTCCCTT CCAACCTGGT GTCAGCAGCC TTTCGCTCAT 1200 
ACTCTACCAC CTATG AAG AG AGG AATATCA CCGGAAOCAG GGTGAAGGTG CCCGTGGGGC 1260 
AGGAGGTGGA GGGGATGAAC ATCCTGGGCT TGGTAGTGTT TGCCATCGTC TTTGGTGTGG 1320 
CGCTGCGGAA GCTGGGGOCT G AAGGGG AGC TGCTTATOCG CTTCTTCAAC TOCTTCAATG 1380 
AGGOCACCAT GGTTCTGGTC TCCTGGATCA TGTGGTACGC 00CTOTGGGC ATCATGTTCC 1440 
TOGTOOCTGG CAAGATCGTG GAGATGC AGO ATGTGGGTTT A CICTI 1G CC CGCCTTGGCA 1500 
AGTACATTCT GT G CT G CCTG CTGGGTCAOG CCATCCATGG GCTCCTGGTA CTGOOCCTCA 1560 
TCTACTTOCT CTTCACODGC AAAAAOOCCT ACOGCTTOCT GTGGGGCATC GTGAOGOOGC 1620 
TGGCCACTGC CTTTGGGACC TCTTOCAGTT OCGOCACGCT GCCGCTG ATG ATG AAGTGCG 1680 
TGGAGG AGAA TAATGGCGTG GOCAAGCACA TCAGCCGTTT CATOCTGCOC AT0GGGGCCA 1740 
CXX3TCAACAT GG ACGGTGCC GOGCTCTTOC AGTGOGTGGC CGCAGTGTTC ATTGCACAGC 1800 
TCAGCCAOCA GTCCTTGGAC TTCGTAAAOA TCATCACCAT OCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGG GGCAGCGGGC ATCCCTGCTG GAGGTGTCCT CACTCTGGCC ATCATCCTCG 1920 
AAGCAGTCAA OCIOOOGGTC G ACCATATCT CCTTG ATOCT GGCTGTG GAC TGGCTAGTCG 1980 
ACCGGTCCTG TACCGTCCIC AATGTAGAAG GTGACGCTCT GGGGGCAGGA CTCCTOCAAA 2040 
ATTATGTGGA CCGTACGGAG TCG AGAAGCA CAGAGCCTGA GTTGATACAA GTGAAGAGTG 2100 
AGCTGCOCCTGGATOCGCTG OCAGTCCOCA CTGAGGAAGG AAACCOCCTC CTCAAACACT 2160 
ATCGGGGGOC CGCAGGGGAT GCCACGGTCG OCTCTGAGAA GGAATCAGTC ATGIM ACCC 2220 
GGGGAGGGAC CTTCCCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATGAGG AATG 2280 
GATAAATGGA TGAGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340 
CCAGCACOCT CCAGGACAGG AGATCTGGGA TGCCTGGCTG CTGGAGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACOC QCAGTTCTCA CTCATGTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCTCCTGCGT OCCCAOGGTG ACCTGOCTGG OCTOOOCTGT 2520 
CTCAGGGAGC AGGTC ACAGG TCAOCATGGG GAATTCTAGC CCCCACTGGQ GGGATGTTAC 2580 
AACACCATGC TGGTTAi ill GGOGGCTGTA GTTGTGGGGG GATGTGTGTG TGCACGTGTG 2640 
TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TTCTGTG AOC TCCTGTOCCC ATGGTACGTC 2700 
CCACCCTOTC CCCAGATCCC CTATTCCCTC CACAATAACA GAAACACTCC CAGGGACTCT 2760 
GGGG AGAGGC TGAGG ACAAA TACCTOCTGT CACTCCAGAG GACATTTTn* TTAGCAATAA 2820 
AATTGAGTGT CAACTATTTA AAAAAAAAAA AAAAAA 



SEQ ID NO:118 PFJ4 Protein sequence 
Protein Accession #: NP_005619.1 

1 11 21 31 41 51 
I I I I I I 

MVADPPRDSK GLAAAEPTAN GGIALASEED QGAAAGGYOG SRDQVRRCLR ANLLVLLTW 60 
AWAGVALGL GVSGAGGALA LGPERLS AFV FPGELLLRLL RMIILPLWC SLIGGAASLD 120 
PGALGRLGAW ALLFFLVTTL LAS ALGVGLA LALQPGAASA AINASVGAAG S AENAPSKEV 180 
LDSFLDLARN IFPSNLVSAA ERSYSTTYEE RNITGTRVKV PVGQBVEGMN BLGLWFAIV 240 
FGVA1RKLGP EGELLIRJFFN SFNEATMVLV SWIMWYAPVG IMFLVAGKIV EMEDVGLLFA 300 
RLGKYILCCL LGHAIHGLLV LPLIYFLFTR KNPYRFLWGI VTPLATAFGT SSSS A1XPLM 360 
MKCVEENNG V AKHBRFILP IG ATVNMDGA ALPQCVAAVF XAQLSQQSLD FVKHTILVT 420 
ATASSVGAAG IPAGGVLTLA nLEAVNLPV DHISLILAVD WLVDRSCTVL NVEGD ALGAG 480 
LLQNYVDRTE SRSTEFEUQ VKSELPLDFL FVPTEEGNPL LKHYRGPAGD ATVASEKESV 540 
M 



SEQ ID NO:119 PFJ3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006708 

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

CTAGTTAAGG CGGCACAGGG OOGAGGOGTA GTGTGGGTG A CTOCTOOGTT OCTrGGGTOC 60 
OGTOGTCTGT G ATACTGCAG TTCAGCC^Q GCAGAACCGC AGCCCCCGTC OGGOGGOCTC 120 
ACGGACGAGG CCGCCCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180 
TTOCAGCAGA OCATGCTACG AGTGAAGG AT (XTAAG AAGT CACTGGATTT TTATACTAGA 240 
GTTCTTGGAA TGACGCTAAT OCAAAAATGT GATTTTCCCA TTATG AAGTT TTCACTCTAC 300 
TTCTTGGCTT ATGAGG ATAA AAATGACATC CCTAAAGAAA AAGATGAAAA AATAGCCTGG 360 
GCGCTCTCCA GAAAAGCTAC ACTTGAGCTG ACACACAATT GGGGCACTOA AGATGATGCG 420 
AOOCAGAGTT ACCACAATGG CAATTCAG AC CCIGGAGGAT TOGGTCATAT TGGAATTOCT 480 
GTTCCTGATG TATACAGTGC TTGTAAAAGG TTTG AAG AAC TGGGAGTCAA ATTTGTGAAG 540 
AAACCTG ATG ATGGTAAAAT GAAAGGCCTG GCATTTATTC AAG AT0CTGA TGGCTACTGG 600 
ATTGAAATTT TG AATOCTAA CAAAATGGCA ACCTTAATGLAffTGCTGTGA GAATTCTCCT 660 
TTOAOATTTC AGAAGAAAGG AAACAATGTG ATTCAAG ATA TTTACATACC AOAAGCATCT 720 
AGGACTGATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTTOC CCTTOCTATT 780 
TCAOCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCAAGCAG TGCTTT ATCT 840 
CATGTCCTTG AATATAGTTG TGTAACTTTA TTTTTTAGGT AATAATTAGA ACAGTTCCCT 900 
TCAGAGGCTG CATTTGCCTT CTTCTGCCAC CTAAATATTA CTTOCCTTCA AATCTGCCTT 960 
TGAATCATCA TTTTTAAAAA AAAATTAACA ItnTTTTOTT GTAOTTATCT TCTGGGGTTT 1020 
CAATTOCTCA GAAACAACTT TTTTCACAAC GGAAAGGAAA GAACACTAGT GTTCTTTCAG 1080 
TAAAGTACAA AGTGTTTATT TTACAAAAGA GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1140 



GCTGACAAGG ATACTGATAG AAAAAGTOAT TTCTTCTTAT TATAAAGTAC ATTTAAAGTT 1200 
CAAGG ACTAA CCTTATITAT TTGGGAAAGG GG AGGAGGAA GGAAATGATA TGGTACCCAG 1260 
ACACTGGGCT AGGCTGCAAC TTTATCTCAT TTAATACTCC CAGCTGTCAT GTGAGAAAGA 1320 
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGGATTATTA ATGGATTTAA GAGGGCATCA 1380 
ATCAGCICAA CTCAAGATTT CATAATCATT TTTAGTATTT AGATTGTOOC TCAAAGTTGT 1440 
AGTACCTCAC AATACCTCCA CTGG 111CCT GTTGTAAAAA OCTTCAGTGA GTTTGAOCAT 1500 
TCTGCTCTTG GCTCTTGGGC TGGAGTACCG TGGTGAGGGA GTAAACACTA GAAGTCTTTA 1560 
GTACAAAACT GCTCTAGGG A CAOCTGGTG A TTCCTACACA AGTG ATGTTT ATATTTCTCA 1620 

TAAAGAGTCT TCCCTATCOC AAGGTCTTCA TGATGCCAGT AGCCATATAT GATAAATTAT 1680 
GTTCAGTGAT AACTTAGTTA TCAG AAATCA GCTCAGTGGT CTTOCCOGOC ATGATTCACA 1740 
TITGATG ACT TTTTAAAAAT CAAAGTGATT TTO AAAATCT CTAATGGCTC AOAAAATAAA 1800 
AACATOCAGT TTGTGGATGA CTATATTTAG ATTTCTCTAG ACTCTAGTGG AAG AGCTTTG 1860 
GAAAGGCCAT GCCAACCGTG CITGTACTGC TAG AAGCACT TTATGTITCC TTTTTGGGTG 1920 

AAATGGATTT ATGTG AGTGC TTTAAACAAA TAGCAATACT TATAG ACTG A AATAAAATGA 1980 
AACTTCAAAT AAG 






SEQ ID NO:120 PFJ3 Protein sequence 
Protein Accession #: NP_006698.1 

1 11 21 31 41 51 
I I I I I I 

MAEPQPPSGG LTDEAALSCC SD ADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTLIQK 60 
CDFPIMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQS YHNGNS 120 
DPRGFGMGI AVFDVYSACK RFEELG VKFV KKPDDGKMKG LAFIQDPDGY WIHLNPNKM 180 
ATLM 






SEQ ID NO:121 PFJ2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002857 

Coding sequence: 70-729 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

CCGACGCCAG GTC CTGO OGT CCCGCCGACC GTCCGGQ AGC GAACCCGTCG TCCCGCACTG 60 
G AGTCQ jCGAJIgG CrTCA GT GACAGATGGT AAACATGGAG TCAAAGATGC CTCTGACCAG 120 
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTGG CAAGACCTCC 180 

WTTCCICITGC GCTATGCTGA TGACACGTTC ACCCCAGCCT TCGTTAGCAC CGTGGGCATC 240 
GACTTCAAGG TGAAGACAGT CTACCGTCAC GAGAAGCGGG TGAAACTGCA GATCTGGGAC 300 
ACAGCTGGGC AGGAGCGGTA CCGGACCATC ACAACAGCCT ATTACOGTGO GGCCATGGGC 360 
TTCATTCTG A TGTATGACAT CACCAATGAA G AGTOCTTCA ATGCTGTCCA AGACTGGGCT 420 
ACTCAGATCA AGACCTACTC CTGGGACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480 
OACATGOAGG AAG AG AGGGT TGTTOOCACT OAGAAGOGCC AGCTCCTTGC AGAGCAGCTT 540 
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600 
CGCCTGGTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAG A CCCGTCGATG 660 
CTGGGCTCCT CCAAGAACAC GCGTCTCTCG G ACACCCCAC CGCTGCTGCA GCAG AACTGC 720 
TCATGCXAgC AAGGCCCACC TTCCTGACCT CCCCTCATTG TGGCCCCACA CCCAAGTCTG 780 
CTTCTCOCTG TTACACACFG TCOGCTCT 



SEQ ID NO:122 PFJ2 Protein sequence 
Protein Accession #: NP_002858.1 

1 11 21 31 41 51 
I I I I I I 

MASVTDGKHG VKDASDQNFD YMFKLLIIGN SSVGKTSFLL RYADDTFTPA FVSTVGIDFK 60 
VKTVYRHEKR VKLQIWDTAG QERYRTTTTA YYRGAMGHL MYDITNEESF NAVQDWATQI 120 

KTYSWDNAQV ILVGNKCDME EERWPTEKG QLLAEQLGFD FFEASAKENI SVRQAFERLV 180 
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC 



SEQ ID NO:123 PFJ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001844 

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

ACGCAGAGCG CTGCTGGGCT GCCGGGTCTC CCGCTTCCTC CTCCTGCTCC AAGGGCCTCC 60 
TGCATGAGGG CGOGGTAGAG ACCCGGACCC GCGCCGTGCT CCTGOOGTTT CGCTGCGCTC 120 
CGOOOGGGCC CX3GCTCAGCC AGGCCOCGCG GTGAGCCAJS ATTCGOCTCG GGGCTCCCCA 180 
GTCGCTGGTG CTGCTGACGC TGCTCGTCGC CGCTCTCCTT CGGTGTCAGG GCCAGGATCT 240 
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAG AGGTAT AATGATAAGG ATGTGTGGAA 300 
GCOGGAGCCC TGCCGGATCT GTGTCTGTGA CACTGGGACT GTCCTCTGCG ACGACATAAT 360 
CTGTGAAGAC GTGAAAGACT GCCTCAGCCC TGAGATCCCC TTCGGAGAGT GCTCCCCCAT 420 
CTCCCCAACT GACCTCGCCA CTGOCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480 
ACCTGGAG AC ATCAAGGATA TTGTAGG AOC CAAAGGACCT CCTGGGCCTC AGGG ACCIGC 540 
AGGGGAACA A GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG OCOCTGGAOC 600 
TCGTGGCAGA G ATGGAG AAC CTGGGACOOC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660 
CCCCCCTGGT OCCOCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720 
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA OCAATGGGCC CCATGGGACC 780 
TCGAGGACCT CCAGGCCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840 
TGAACCTGGT GAACCTGGTG TCTCTGGTCC CATGGGTCCC CGTGGTCCTC CTGGTCCXXC 900 
TGGAAAGCCT GGTGATGATG GTGAAGCTGG AAAAOCTGGA AAAGCTGGTG AAAGGGGTCC 960 
GCCKKJrCCT CAGGGTGCTC GTGGTTTCCC AGGAACCOCA GGOCTTCCTG GTGTCAAAGG 1020 
TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080 
GGGTGAGAGT GGTTCCCCGG GTGAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGOCT 1140 
GCCTGGTGAA AGAGGACGGA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200 
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCOC 1260 
TGGTGCTCCT GGAGOCAAGG GTGAAGCCGG CCCCACTGGT GCCCOTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTOC TGGGTOCCCT GGGCCTGCTG GTGCCTCCGG 1380 
TAAOOCTGGA ACAGATGG AA TTOCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440 
TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAACAAGG 1560 
CCCCAAGGG A GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCG CTGGTG AAGA 1620 
AGGCAAGAG A GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGOAGA 1680 
AAGAGGTGCT CCCGGAAACC GCGGTTTCCC AGGTCAAGAT GGTCTGGCAG GTOCCAAGGG 1740 
AGOCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800 
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGOCCGGGGT CTCACTGGCC GCCCTGGTG A 1860 
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGG AGCCCCT GGTGAAGATG GTCGTCCTGG 1920 
ACCTCCAGGT CCTCAGGGGG CTCGTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCGCCAA 1980 
AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040 
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT GAACGAGGCG AGCAGGGTGC TCCTGGGCCA TCTGGGTTCC AGGGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAG GTGAAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGGAGCC CCTGGOCTGG TGGGTCCCAG GGGTGAACGA GGTTTCCCAG GTGAACGTGG 2280 
CTCTCCCGGT GCCCAGGGCC TCCAGGGTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340 
TGGTCCCAAA GGTGCATCTG GCCCAGCAGG CCOCCCTGGC GCACAGGGCC CTCCAOGTCT 2400 
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460 
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520 
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580 
TCCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG GAGAGACTGG 2640 
CCCCCCCGGA CCAGCGGGAT TTGCTGGGCC TCCTGGTGCT GATGGCCAGC CTGGGGCCAA 2700 
GGGTGAGCAA GGAGAGGOCG GCCAGAAAGG CG ATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760 
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGGAGTGACT GGTCCTAAAG GAGCCCGAGG 2820 
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGGACCCCC 2880 
AGGCTCCAAT GGCAACCCTG G ACCCCCTGG TCCCCCTGGT CCTTCTGGAA AAGATGGTCC 2940 
CAAAGGTGCT CGAGG AGACA GCGGCCCCCC TGGCCGAGCT GGTGAACCCG GCCTCCAAGG 3000 
TCCTGCTGGA CCCCCTGGCG AGAAGGG AGA GCCTGGAGAT GACGGTCCCT CTGGTGOCGA 3060 
AGGTCCACCA GGTCOCCAGG GTCTGGCTGG TCAGAGAGGC ATOGTCGGTC TGCCTGGGCA 3120 
ACGTGGTG AG AGAGGATTCC CTGGCTTGCC TGGCCCATCG GGTG AGCCCG GCAAGCAGGG 3180 
TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTGAC 3240 
GGGTCCTGCA GGTGAACCCG OACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300 
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360 
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420 
AGGAGAAGCT GGTGCACAAG GCCCC ATGGG ACCCTCAGGA CCAGCTGGAG CCCGGGGAAT 3480 
CCAGGGTCCT CAAGGCCCCA GAGGTGACAA AGG AG AGGCTGG AG AGCCTG GCGAGAGAGG 3540 
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTG G TC C 1 1C 3600 
TGGAGACCAA OGTGCTTCTG GTCCTGCTGG TCCTTCTGGC CCTAGAGGTC CTCCTGGCCC 3660 
CGTCGGTCCC TCTGGCAAAG ATGGTGCTAA TGGAATCCCT GGCCCCATTG GGCCTCCTGG 3720 
TCCCCGTGG A CG ATCAGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780 
TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840 
GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900 
GAGACAGCAT G ACGCCGAGG TGG ATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960 
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020 
CTGCCACCCT GAGTGG AAGA GTGGAG ACTA CTGG ATTGAC OCCAACCAAG GCTGCACCTT 4080 
GGAOGCCATG AAGGTTTTCT GCAACATGG A G ACTGGCG AG ACTTGCGTCT ACCCCAATCC 4140 
AGCAAAOGTT CCCAAG AAG A ACTGGTGG AG CAGCAAGAGC AAGG AGAAG A AACACATCTG 4200 
GTTTGGAGAA ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260 
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320 
CATCACCTAC CACTGCAAGA ACAGCATTGC CTATCTGGAC GAAGCAGCTG GCAACCTCAA 4380 
GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAGATC CGGGCAGAGG GCAATAGCAG 4440 
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGAAACAT ACCGGTAAGT GGGGCAAGAC 4500 
TGTTATCGAG TACCGGTCAC AG AAGACCTC ACGCCTCCCC ATCATTG ACA TTGCACCCAT 4560 
GGACATAGGA GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCITCTTGIA 4620 
AAAACCTGAA OCCAGAAACA ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680 
AATCTCAGTC ACTCTAGOAC TCTGCACTGA ATGGCTG ACC TGACCTGATG TCCATTCATC 4740 
CCACCCTCTC ACAGTTCGGA CTTTTCTCCC CICICTTTCT AAGAGACCTG AACTGGGCAG 4800 
ACTGCAAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860 
CAAGGCAGAG GCAGG AAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920 
GCCCAGGCCA GAAGACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980 
ATGGTGCTAT TCTGTGTCAA ACACCTCTGT Al 1 1*1 IT AAA ACATCAATTG ATATTAAAAA 5040 
TGAAAAGATT ATTGG AAAGT 



SEQ ID NO:124 PFJI Protein sequence
Protein Accession #: NP_001835.2

1 11 21 31 41 51 
I I I I I I

MIRLG APQSL VLLTLLVAA V LRCQGQD VQE AGSCVQDGQR YNDKDVWKPB PCRICVCDTG 60 
TVLCDDIICE DVKDCLSPEI PFGECCPICP TDLATASGQP GPKGQKGEPG DIKDIVGPKG 120 
PPGPQGPAGE QGFRGDRCDK GEKGAPGPRG RDGEPOTPGN PGPPGPFGPP GPPGLGGNFA 180 
AQMAGGFDEK AGGAQLGVMQ GPMGPMGPRG PPGPAG APGP QGFQGNPGEP GEPGVSGPMO 240 
PRGPPGPPGK PGDDGEAGKP GKAGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 300 
EAGAPG VKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 360 
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAKG 420 
S AGAPGIAGA PGFPGPRGPP GFQG ATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480 
• APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 540 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600 
VMGFPGPKGA NGEFGKAOEK GLPQAPGLRG LPGKDGETGA AGPPGPAGPA GERGEOGAPG 660 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720 
LPGTPGTDGP KGASGPAGPP G AQGPPGLQG MPGERGAAGI AGFKGDRGDV GEKGPEGAPG 780 
KDGGRGLTGP IGPPGPAGAN GEKGEVGPPG PAGSAGARGA PGERGETGPP GPAGFAGPPG 840 
ADGQPGAKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPG ATGFPG 900 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 1020 
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAG VKGDRGE TGAVGAPG AP GPPGSPGPAG 1080 
PTGKQGDRGE AGAQGPMGPS GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG 1140 
LPGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200 
PPGNPGPPGP PGPPGPGIDM S AFAGLGPRE KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQIESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320 
ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LUQGSNDVE IRAEGNSRFT YTALXDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PUDIAPMDI GGPEQEFGVD IGPVCFL 



SEQ ID NO:125 PFH9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005084

Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GCTGGTCGG A GGCTCGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGGAG CGCTTGGGTC 60 
GGOTIGGTGC GCGGTGGAAC GOGOOCAGGG ACCOCAGTTC CCGCGAGCAG CTCCGCGCCG 120 
OGCCT GAGAG ACTAAGCTGA AACTGCTCCT CAGCTCCCAA G^3SGT GCCA CCCAAATTGC 180 
ATGTGCTTTT CTGOCTCTGC GGCTGCCTGG CTGTGGTTTA TCCrTTTGAC TGGCAATACA 240 
TAAATCCTGT TGOCC ATATG AAATCATCAO CATCGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGOCAA ACTAAAATCC COGGGGGAAA TGGGCCTTAT TOOGTTGGTT 360 
GTACAGACTT AATGTTTG AT CACACTAATA AGGGGACCTT CTTGOGTTTA TATTATCCAT 420 
CCCAAGATAA TGATOGCCTT GACACCCTTT GGATCOCAAA TAAA GAATAT T TTTGGG GTC 480 
TTAGCAAATT IVi 1UGAACA CACTGGCTTA TGGGCAACAT TTTGAGGTTA CTCTTTGGTT 540 
CAATGACAA C TOCT GCAAA C TGOAATTCOC CTCT G AGG CC TGGTGAAAAA TATOCACTTO 600 
TTGTTTTTTC TCATGGTCTT GGGGCATTCA GGACACTTTA TTCTGCTATT GGCATTGACC 660 
TGGCATCTCA TGGGTTT ATA GTTGCTGCTG TAGAACACAG AGAT AGATCT GCATCTGCAA 720 
CTTACTATTT CAAGGACCAA TCTGCTGCAG AAATAGGGGA CAAGTCTTGG CTCTACCTTA 780 
GAAOCCTGAA ACAAGAGGAG GAG ACACATA TACG AAATG A GCAGGTACGG CAAAG AGCAA 840 
AAG AATGTTC OCAAGCTCTC AGTCTGATTC TTG ACATTGA TCATGG AAAG CCAGTGAAG A 900 
ATGCATTAGA TTTAAAGTTT GATATGGAAC AACTGAAGGA CTCTATTG AT AGGGAAAAAA 960 
TAGCAGTAAT TGGACATTCT TTTGGTGGAG CAACGGTTAT TCAG ACTCTT AGTGAAG ATC 1020 
AGAGATTCAG ATGTGGTATT GOOCTGGATG CATGG ATGTT TOCACTGGGT GATGAAGTAT 1080 
ATTCCAGAAT TCCTCAGCCC CTCTTTTTTA TCAACTCTQA ATATTTCCAA TATCCTGCT A 1140 
ATATCATAAA AATGAAAAAA TGCTACTCAC CTG ATAAAG A AAG AAAG ATG ATTACAATCA 1200 
GGGGTTCAGT CCACCAGAAT TTTGCTG ACT TCACTTTTGC AACTGGCAAA ATAATTGG AC 1260 
ACATGCTCAA ATTAAAGGG A OACATAGATT CAAATGT AGC TATTGATCTT AGCAACAAAG 1320 
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTG ATTG A AGGAGATGAT GAGAATCTTA TTCCAGGGAC CAACATTAAC ACAACCAATC 1440 
AACACATCAT GTTACAGAAC TCTTCAGGAA TAG AOAAATA CAATJ^GGAT TAAAATAGGT 1500 
TTTTT 



SEQ ID NO:126 PFH9 Protein sequence
Protein Accession #: NP_005075.1

1 11 21 31 41 51 

MVPPKLHVLF C3£GCLAVVY PFDWQYWV AHMKSS AWVN KIQ VLMAAAS FGQTKIPRGN 60 
GPYSVGCTDL MFDHTNKGTF LRLYYPSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGNI 120 
LRIXPGSMTT PANWNSPLRP GEKYFLWFS HGLGAFRTLY S AIGIDLASH GFTV AAVEHR 180 
DRSASATYYF KDQSAAOGD KSWLYLRTLK QEEETHIRNE QVRQRAKECS QALSULDID 240 
HGKPVKNALD LKFDMEQLKD SIDREKIA VI GHSFGGATV1 QTLSEDQRFR CGIALDAWMF 300 
PLGDEVYSRI PQPLFFINSE YFQYPANIIK MKKCYSPDKE RKMmRGSV HQNFADFTFA 360 
TGKHGHMLK LKGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCUE GDDENliPGT 420 
NINTTNQHIM LQMSSGEKY N



SEQ ID NO:127 PFH8 DNA SEQUENCE

Nucleic Acid Accession #: NM_015900

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

CACGAGCGGC ACGAGGATTT CCAGCTCAGC GAJECCCCCA GGTCCCTGGG AGAGCTGCTT 60 
CTGGGTGGGG GGOCTCATTT TGTGGCTCAG CGTTGGAAGT TCAGGGGATG CACCTOCTAC 120 
CCCACAGCCA AAGT GCGC TG ACTTOCAG AG CGOCAACCTTTTTGAAGGCA OCGATCTCAA 180 
AGTCCAGTTT CTOCTCTntJ TCCCTTCGAA TCCTAGCTGT GGGCAGCT AG TAG AAGGAAG 240 
CAGTGACCTC CAAAACTCTG GGTTCAATGC CACTCTGGGA ACCAAACTAA TTATOCATGG 300 
ATTCAGGGTT TTAGGAACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CCCTTCTGCG 360 
TGCAACGAAT GCTAATGTGA TTGCXXJTGGA CTGGATTTAT GGGTCTACAG GAGTCTACTT 420 
CTCAGCTGTG AAAAATGTGA TTAAGTTGAG CCTCG AGATC TCCCnTTCC TCAATAAACT 480 
CCTGGTGCTG GGTGTGTOGG AATCCTCAAT CCACATCATT GGTGTTAGCC TGGGGGCGCA 540 
CGTTGGGGGC ATGGTGGG AC AGCTCTTCGG AGGCCAGCTG GGACAGATCA CAGGOCTGGA 600 
CCCCGCTGGA CCTG AGTACA CCAGOGCCAG TGTGGAAG AG CGCTTGGATG CTGGAGATGC 660 
CCTCTTCGTG GAAGCCATCC ACACAG ACAC CGACAATTTG GGTATTCGGA TTCCCGTTGG 720 
ACATGTGGAC TACTTOGTCA ACOGAGGCCA AOACCAAOCTGGCTGCCCCAC CTTCTTTTA 780 
CGCAGGTTAT AGTTATCTG A TCTGTG ATCA CATG AGGGCT GTGCAOC1CT ACATCAGOGC 840 
OCTGGAGAAT TCCTGTCCAC TGATGGCCTT TTXCTGTGCC AGCTACAAGG OCTTOCITGC 900 
TGGACGCTGT CTGGATTGCT TTAACCCT1T TCTO CT T ICC TGOOCAAGGA TAGGACTGGT 960 
GGAACAAGGT GGTGTCAAG A TAG AGCCGCT CCCCAAGGAA GTG AAAGTCT AOCTOCTGAC 1020 
TACTTOCAGT GCTCCGTACT GCATGCATCA CAGOCTCGTG G AGTTTCACT TGAAGG AACT 1080 
G AG AAACAAG GACACCAACA TOGAGGTTAC CTTOCTTAGC AGTAACATCA CCTCTTCATC 1140 
TAAGATCACC ATACCTAAGC AGCAACGCTA TGGGAAAGGA ATCATAGOCC ATGOCACCOC 1200 
ACAATGCCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260 
AGAOOGGACT AOCATTATTG GG AAGTTCTG CACTGOOCTT TTGCCTCTCA ATGACAGAGA 1320 
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380 
CCTGAAGAT A GCCTGTGT GT AGT TTAACCT GGGCAGGACA CATCTCCCTG CATTTTTTTT 1440 
TriTTTTTTT GAGAG AGAGG TGTOATQ AGO GATGTGTGTG TGCAGCTTAT TGTAGACCAT 1500 
TACTACTAAG GAGAAAAGCA AAGCTCTTTC TTATTTTDCT CATAATCAGC TACCCTGOAO 1560 
GGGA G GGAG A ACTCATTTTA CAGAACTTGG TTTOCTTTGC CGATCTTATG TACATACCCA 1620 
TTTTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CTOCTTGGGC ATTCGTACTT 1680 
AGG ATTCAAT AGAAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA 



SEQ ID NO:128 PFH8 Protein sequence

Protein Accession #: NP_056964.1

1 11 21 31 41 51 
I I I I I I 

MPPGPWESCF WVGGULWLS VGSSGDAPPT PQPKCADPQS ANLFEGTDLK VQFLLFVPSN 60 
PSOGQLVEGS SDLQNSGFNA TU3TKLTJHG FR VUGTKPS W IDTHRTLLR ATNANVIAVD 120 
WIYGSTGVYF SAVKNVIKLS LEISLFLNKL LVLGVSESSI IfflGVSLGAH VGGMVGQLFG 180 
GQLGQITGLD PAGPEYTRAS VEERLDAGDA LFVEAIHTDT DNLGIRIPVG HVDYFVNGGQ 240 
DQPGCPTFFY AGYSYUCDH MRAVHLYIS A LENSCPLMAF PCAS YKAFLA GRCLDCFNPF 300 
LLSCPRIGLV EQGGVKIEPL PKEVKVYLLT TSS APYCMHH SLVEFHLKEL RNKDTNEEVT 360 
FLSSNITSSS KITIPKQQR Y GKGIIAHATP QCQINQ VKFK PQSSNRVWKK DRTT1IGKFC 420 
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LKIACV 



SEQ ID NO:129 PFH7 DNA SEQUENCE

Nucleic Acid Accession #: NM_014384

Coding sequence: 89-1338 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

CGTTGCCGGG TCGCAGGTCC CGCCAGTGCX} AGCXjCAACGG AGGTCGAAGG CGTTCAGACT 60 
CTTAGCTGAA CGCGGAGCTG CGGCGGCTAXGCTGTGGAGC GGCTGCCGGC GTTTOGGGGC 120 
GCGOCTCGGC TGCCTGCCCG GCGGTCTCCG GGTOCTCGTC CAGACCGGCC ACCGGAGCTT 180 
GAC CTCCT GC ATCGACOCTT CCATGGGACT TAATG AAGAG CAGAAAGAAT TTCAAAAAGT 240 
GGCCTTTGAC TTTGCTGCCC GAGAGATGGC TCCAAATATG GCAGAGTGGG ACCAGAAGGA 300 
GCTGTTCCCA GTGGATGTGA TGCGG AAGGC AGOCCAGCTA GGCTTCGGAG GGGTCTACAT 360 
ACAAACAG AT OTOGGCGGGT CTGGGCTGTC ACXJTC1TGAT AOCTCTGTCA TTTTTGAAGC 420 
CTTGGCTACA GGCTGCACCA GCACCACAGC CTATATAAGC ATCCACAACA TGTGTGCCTG 480 
GATGATTGAT AGCTTOGGAA ATGAGGAACA GAGGCACAAA TTTTGCCCAC CGCTCTGTAC 540 
CATGGAGAAG 1 1 lOCl IGCT ACTGCCTCAC TGAACCAGGA AGTGGGAGTG ATGCTGCCTC 600 
TCTTCTGAGC TCCGCTAAGA AACAGGG AGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660 
CATCAGTGGT GCTGGTG AGT CAGACATCTA TGTGGTCATC TGCCGAACAG GAGGACCAGG 720 
CCCCAAGGGC ATCTCATGCA TAGTTGTTGA GAAGGGGACC OCTGGCCTCA GCTTTGGCAA 780 
GAAGGAGAAA AAGGTGGGGT GG AACTOCCA GCCAACACGA GCTGTOATCT TCGAAGACTG 840 
TGCTGTCCCT GTGGCCAACA GAATTGGGAG CGAGGGGCAG GGCTTCCTCA TTGCCGTGAG 900
AGCACTCAAC GGAGGGAGGA TCAATATTGC TTOCTCCTOC CTGGGGGCTG COCACGOCTC 960
TGTCATDCTC ACOOGAGAOC ACCTCAATOT OOGGAAGCAG TTTGGAG AGC CTCTGGCCAG 1020 
TAAOCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG COGCGCGGCT 1080 
GATCOTOCGC AATCCAGCAG TCGCTCTCCA GGAGGAGAGG AAGGATGCAG TGGOCTTOTO 1140 
CTOCATCGCC AAGCTCTTTO CTACAGATCA ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200 
GATGCACGGG GGCTACGGCT ACCTOAAGGA TTACGCTGTT CAGCAGTACG TGCGGGACTC 1260 
CAG GGTCC AC CAGATTCTAG AAGGTAGCAA TGAAGTGATG AGGATACTGA TCTCTAGAAG 1320 
C CTGCTT CAO GAGT^gAACC CACACTTGTT CTGGCCTGGT GTTCAGTGCG ACTGCAGTCA 1380 
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTOCAAAGGA ATCATGGATT AGACCCAAGG 1440 
GCTGAGCTCC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCAOCAGCAT OGGGTCTTGG 1500 
ACTGGGGCAG AATCCCCAGT GG AACCGG AA GAGCTGGACT GATGAGAAAC ATCAGAAGAA 1560 
CACATACTAC CTTGTTTTOC TAATGCCAG A AGGGTG ACCA GTGAAGATTC AOGGTCAAAC 1620 
CATGAAAGTC CI 1 ICTFGGA TOCACTTTAT CTTCATTAGT CTGCATTTTA CTAGTTCACT 1680 
GGATCCCTCC TCTAGGGGOC TGGGGACTTT CACTGATGCT CITOCTGATT CTAG AGCAAA 1740 
GGTGTGGGAA GGGGAAATGG AGGAATGOCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1800 
TAC AOAT OCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGTAAATCAT G ATAAAATCG 1860 
ATATTTGGAA ACTTACTCCT AAGCTGTGAT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTGAGACT TTTGAATGTT GAATATTCGT TGGGTTICAT GTTAAG AGGC 1980 
CTGTGGTCCA GGAGTGCTAT TCAGTGTTTC TGTTOCTO AT AAACACTTTG AATATTTTTT 2040 
TGTGTTTTTG TTTCCTTTTC TG AAGCTGTT CCTOCTTTTA AATATTTTTA ATCACATTGA 2100 
TAAAATCTAT CCTTCATOCA OCTCTGGTTC TACTATAGTT GATTTTTATT TTAAATGTTT 2160 
AATTGTATTT GATTAAACAC TTAACTGGAT TTTGGAATAA TAAAACTCTC GTCCAATTTG 2220 
GCTTTT AAAA AAAAAAAA 



SEQ ID NO:130 PFH7 Protein sequence
Protein Accession #: NP_055199.1



1 11 21 31 41 51 
I I I I I I 

MLWSGCRRPG ARLGCLPGGL RVLVQTGHRS LTSCIDPSMG LNEEQKEFQK VAFDFAAREM 60 
APNMAEWDQK ELFP VDVMRK AAQLGFGGVY IQTD VGGSGL SRLDTS VIFE ALATGCTSTT 120 
AYISIHNMCA WMIDSPGNEE QRHKPCPPLC TMEKFAS YCL TEPGSGSDAA SULTS AKKQG 180 
DHYILNGSKA FISGAGESDI YWMCRTGGP GPKGISOW EKGTPGLSFG KKEKKVGWNS 240 
QPTRAVIFED CAVPVANRIG SEGQGFLIAV RGLNGGRINI ASCSLOAAHA SVBLTRDHLN 300 
VRKQFGEPLA SNQYLQFTLA DMATRLV AAR LMVRNAA VAL QEERKDA V AL CSMAKLFATD 360 
ECFAICNQAL QMHGGYGYLK DYAVQQYVRD SRVHQILEGS NEVMRHJSR SLLQR 



SEQ ID NO:131 PFH6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013989

Coding sequence: 707-1105 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
I I I I I I 

GOCTGCAGAG AGAGGCACTT TGCAOCACAG ACAG ATAGCA AGAAGGGAAA GACAGAGAGT 60 
GAGAAAAAAG AGGAGTCAGT OGCTOCTCGG GAAGGGAGAG AGTGAQACTG GGAGAAAGAG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TAOOCTTAAA 180 
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAGAAAGA AACAGGCTAC GTTTAAAGAG 240 
CATAGAGACA ATG AAAGGCT AAAGAAAATT TTAAAATCTC TGOCACAGTC TCATAGGTGC 300 
TTGGAAATGA AAGTA GAACT GCC1G ICT IT AAOGGACTCT GACAGAGGTA ACTGGATTAO 360 
GGACGAGTAC GOCAGC1 111 1111HLL11 11 ITlTnTi TTTAACATCT TAAATCCTGA 420 
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTOCGAAT TG AATG AATT GATGGGCACA 480 
CTCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC ATTTCTGCIT CTTTGAAAGA 540 
GGAGACAACT TGGGC1 1UC1 111 A ATTTAG 11111 ITIU C CCTTCKXCC CAACOCCCAA 600 
CCTTCCCCCT TACCTCCCOC A<XCCCTTTA TCACCACCCC CXTTTTAAAT AAGAGGGTG A 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA GAGAAGAXGG GCATCCTCAG 720 
OGTAGACTTG CTGATCACAC TGCAAATTCT GOCA U l 1 I II 1 1C 1CCAACT GOCTCTTCCT 780 
GGCTCTCTAT GACTCGGTCA TTCTGCTCAA GCAOGTGGTG CTGCTGTTGA GCCGCTCCAA 840 
GTCCACTCGC GGAGAGTGGC GGCGCATGCT GACCTCAG AG GGACTGCGCT GOGTCTOGAA 900 
GAOC nCCTC CICGATCCCT ACAAACAGGT GAAATTGGGT GAGGATGOOC CCAATTCCAG 960 
TGTGGTGCAT GTCTCCAGTA CAGAAGGAGG TG ACAACAGT GGCAATGGTA CCCAGGAGAA 1020 
GATAGCTGAG GGAGOCACAT GOCAOCTTCT TGACTTTGCC AGOCCTG AGC GGOCACTAGT 1080 
GGTCAACTrr GGCTCAGCCA CTIQAPCTCC TTICACGAGC CAGCTGCCAG CCTTCCGCAA 1140 
ACTCGTGGAA GAGT ICTOCI CAGTGGCTGA CI ICCIUCIU GTCTACATTG ATGAGGCTCA 1200 
TCCATCAG AT GGCTGGGCG A TACCGGGGGA CTCCTCTTTG TCTTTTGAGG TGAAG AAGCA 1260 
CCAGAACCAG GAAGATOGAT GTGCAGCAGC CCAGCAGCTT CTGG AGCGTT TCT C C'l 1UCC 1320 
GCCCCAOTGC CGAGTTOTGG CTG ACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380 
AGCCTTTGAA CGTGTGTGCA TTGTGCAG AG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440 
CCCCTTCTCC TACAACCTTC AAGAAGTCCG GCATTGGCTG G AGAAGAATT TCAGCAAG AG 1500 
ATGAAAGAAA ACTAGATTAG CTGGTTAAAG GTATGATTAT AAGAGAGCTT ATTGTTTTAA 1560 
AAAGTTATAT AAAGGCAAGG AAATTAAGAA CTGAATCCAT ATTTCAACAG AGCCCTATTG 1620 
OCTTACTQAA AGACAGGAGT TTATCTATCG GAAGAACATG AATCTCTAAC AGCTCCATAC 1680 
TTCnTCACT ACTCAAATGG CATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAA AAGCCC TATGTGAAAA GATCCCAAGA TGGAGAGGAA QAAAOOCTAA TTCAGCATGT 1800 
GTICATTCTG CATTGAGAAG GAACTG ATAC ATCTGATGCA TGCTTTGAG A CCAGAAGAAA 1860 
AGACTTACCT GAATAATTAC TACATTAGGG AAGCTACTGT CTACGTTAAG ATAAAGGGTA 1920
TTGCCTTGGC TCTAnTGGC ATGGATGGAG OCCAGTTGGA AAATTCCCAA ATATTACAAC 1980
AAGTOCTTG A ACOCAGGOCA TOTOGTTAO A OOTTOGTCTT AAGGTTAGAC CTTATCTTAG 2040 
AGTCATTTCT GATGTTOCAG CTTCTAGCCA TGTAGTOCTC TCAGTCTTCA TACCCCAGAA 2100 
ATTATTGGTA TATTTGTAGA TAOOGAGAAT GATOCCTCAG TCTGAGAGGT TAGAATOATC 2160
ATCTGTAATC TGAGGGTTAA TTTCTAGGCA GGTGGAGAGA GTGGTAAAAA AGAAATGAAA 2220
TTGACAAGCr AGGAAAGAGG AGGCAG AAAG ATTTGG AAAA TTCACAG AGT TTCACCCTTA 2280 
AGCTGTAGAG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAAAG AAG AAGGAGCTCA ACTAAAAGTG GCATAGAGAA TACACATATA AAAACAATAT 2400 
ATTTGTCATA TGCTCCTAGA GAGOAGAAAG GOOTGATTGA AAGAAAAAAA AATACTTAAA 2460
TATTTGTAAT 7GTGACGGGT 1 1C1 11 TOG A AATAATTACT TTTOAACCAT GTATGTGGTA 2520
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGGT 2580 
CTGGTGGTGC TGGTCTTTTC CTCOCCATTC CTACAATTTC TATQTGGCCC AAGTCATTCC 2640 
T AATCTTGGT CTCTATAGCA GTGTTCTCTC TGAATGCTGA GCTGAAGAAA TTATAOGTAC 2700 
ATACACACAT ACATACATAC ATACAAATAT ATGTATATAT ATTCTCAGCT GCTGCGGGAG 2760
GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT GAGCTATAGT 2820
GAAGAATAGG TGCAAACAAA CAAGCTTACT TCCATTGCAA AATAG AAG AA GAGGAAGTTA 2880 
GAG ATAATTC TGATCAATCA TTTTGGAGGC TTTGTTATAA GGCAACOCCC GGTATATCAT 2940 
GGAATTTCCA TTGACATTTG AATTTGGACT TGGATCTTCC C11 G U1CCCA TTAGCTGAGG 3000 
TTTAGTAATC TAAAGTCCCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060
TAAAATATTT llllUlllli AAAATAGACA CTATAGTTTT ACCCATAAGT AATATTTAAA 3120
GATTATAGCT COCAAAAGAA TGG ACCAAGC ACTTTOGTAT CATAATITCT TTTTGGTAAA 3180 
TATGAGACTA TTATGAAATC ATAGTATATG ATTGTATTTA AAGGTACAAT CAAAGGATCT 3240 
TTTGTCCATT CCATTAATAA CTGAATAAAA AATAAATAAA ATGGATAG AA AAAAACTAAA 3300 
GTTGAAAATA CATTCTTAAA CTAGTTGTCT GAAATGAGAA AAGAGTGAGA ACTAGGTGTG 3360
CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGGAG CAACATATCA GTCGTGTCAC 3420
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGG AC TGATTTTTCA TGGCAACATC 3480 
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CCACCCACAA AGAAACAAAG CAAATTTCAT 3540 
TATCTAATGA GTTGCTGGAA AATCATATTG AG AATAATTA TTTCAGATTC CTCAGTTGTT 3600 
AACTTCTACA TTCAAGGGCT TATCTCTGGC CCCATTGATT TTTAACCTCA AAATGGTGTG 3660
AGATTTACTG TGGAAOOCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720
TGTTGTOCTT AAAATTCCOC TTTTTTCTCT ATGTACG ATA AAGTAACAGT ATGTCAG ATA 3780 
AGCCGGTGGG GGG ATG AG AT TAGGCTGAGG CAGTGCTAGT CAACTGGGGG AAAAGGATGA 3840 
TGG AAAAATC ACCCAGTTGT GCTATATTTT TAAAG AAGGA GGTOGTTTAT GTGTGCAG AC 3900 
AATTCTOCCT GAGOTTAGCC CAATGGAGAA ATGAAGCAGA GGAAGGAAAC ATAGAAAGAC 3960
ATGGGCTATC AGGGAGGAAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020
GTGG AAGGGC CAATGGAGAA AATGAATGG A CAAAGCTCAG GAATCCCTAC GCTATGTAG A 4080 
ATGTTCTTGG TGTTATCAGG GTTAAGCCCT GTAATTATGT AACCTATTTA TCGCAACATG 4140 
AATTTTTATG ATTTCTTGTG ATGTATTCTT TTATGAAATT AACAAGAACT CATTATTTTG 4200 
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAGAA ATTATTAGAT TGCCAATACT 4260
CATGTGCGTT TCATGTGTTT TATAAGGTTT GTTOCTTTGA AGAATTGTAG TTCTTAOTCC 4320
CACAGGGAAA TGTGTATCTA TTTATATATC ATAGTATAAA TCTATGATAT ATTTATATCA 4380
TATATAAAAG TCTGAGTTCT CTTTCTTAGT CCCTAATCAT GTTTCTCCCA TAGGCTGTGT 4440
TTACATGGAG CTATCGGTTT AGCCTTTTAA GCTTCATTAG CTTGTCTATT ATTGAAATAG 4500 
TTTCCAAGAA ATTTTAGATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTOTTTG 4560
AAAGACTTAT GTCTTGGACC TATCAAAAAC TGACTTTATT TATTGCTTAG TGAAAATACT 4620
AGTGGGATCA ACAATGATTT TCTTGAATGG GCATGAATGG AGATGOCCGC ACAGTAATGT 4680 
AGAAATGTTT CATACAGCTA TTAAAATGTA ACTGACCTCC TTAGAGGCAG ATTAGTAACT 4740 
GTTCCTACTT TGTATAGCTA AGTGACAGTC ACTTAACTTA CATGACTTTC TTTTTTCACA 4800 
TTGGGTCTCT GGTCCTGTGT CTTCACCTCA TTTATX&CAC GTCICCTTGA TTTTTGGTAG 4860
TATCAACTTC CCAGTGATCT GTTCAGTTAA GTTCTTCTCC GGTTAACCAG GAAGTGCTTA 4920
TTCICICATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTGA GTGTATTTTA 4980 
CTATTTGGGC TCTG AAATAA AAATTATGAA ATATGGTGAG GTCACATGTT GGTGCTGCCT 5040 
TGCTGCATAA AATTCTAGGA GGGCAGGTTA GGAGACAGTT ATGTATGGCC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160
AGCAAGAAGA ATTGACTGAT TTACAGGACT TCTCTTTATG TCAATCTTAA GAGGATGGAT 5220
G AATCTGGAC ATTTGTTCCA CCCGACCTCT GACTG ATGGT TTGGAAAATA ACTTTAATTA 5280 
GGATCATATG ACCATTG AAA AAGGAAAAAT GTAGACTCTG ACTTCOGTCC CACTG AAGGA 5340 
TTAATGAAAA CCTTTACTAG CATTTAGAGC TTTTCAG AAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA GACTGCAAGT AAGGCTTTTA ATTTTAGGAG Ul 11 11 HIT 11111U1U 5460
TTCCCCTAAA TGGTATGGCC AAAAGTCAGA GTTAAAATAT ATATAGTTAG ATTCGAACTT 5520
CCTCCTTCAC TCTAAAAATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCT GAGGCTCTCT 5640 
AAGAAAAG AG AGG ATCTAGG ATGGGAGAGC TAG AAAGTTG CTAACTGGGA AGAACAAGGC 5700 
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTGAAAAC AAACTTCTCG CAACTGAAGG 5760
AAGGCTGAAG GCTGCTGCAA GTCATTGAGT GACTTTAGGA TGAGCAAAAC ATTGGGCCAC 5820
TTCCTA ATGC CCTATGTGTA TAGTACCAGA AGCAAGGTCT CAGACTTAAC AGACCCAGCT 5880 
CTGTTCCAAG GTGAGTCTGA ACCAATAGAA AGCAAACATG TGCAG ATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AACAGCAGAG CTCCAGGGAG 6000 
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060
TGGCTCAAAC CCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120
GAAAG AAACC CACAG AAGGG GATGGGAAAT AAAOAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTCAGCGTG GTTTTTATGT GTATTAGGAT TGGGGGATGT GAAGAAATAA 6240 
GT ATCCAG T A CTTT AT AACC AAAGCAATTA AATGATATTG GGGTAGGGAA TGTTGGOCAG 6300 
TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360
CGCCCCGAAG AGGGAGACAG AGATGTGCCA GAGTTGACCC AGTGTGCGGA TGATAACTAC 6420
TGA CGAAAQ A GTCATCGACC TCAGTTAGTO GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480 
CCCATCTTTG TCTCOCTGGC AAGG AGAATA TGCGGGACAT GATGCTAAGA GOCCTGGGTA 6540 
AATGTGGTGA GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600 
TGAACTGCTT TGGGAGATTA TCAGTAGAAA GAGTGTTATC ATATTGGTGC TGAGTGCTAT 6660
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT GAATAAAAOA ATAAAAAAAA 6720
AAAAAAAAAA AAAAA 



SEQ ID NO:132 PFH6 Protein sequence
Protein Accession #: NP_054644.1 

1 11 21 31 41 51 
I I I I I I

MGILSVDLLI TLQILPVFFS NCUFLALYDS VTLLKHWLL LSRSKSTRGE WRRMLTSEGL 60 
RCVWKSFLLD AYKQVKLGED APNSSWHVS STEGGDNSON GTQEKIAEGA TCHLLDFASP 120 
ERPLWNPQS ATXPPFTSQL PAFRKLVEEF SSVADFLLVY IDBAHPSDGW AIPODSSLSF 180 
EYKKHQNQED RCAAAQQLLE RFSLPPQCRV V ADRMDNN AN IAYOVAFERV CIVQRQKIAY 240 
LGGKGPFSYN LQEVRHWLEK NFSKRXKKTR LAG 



SEQ ID NO:133 PFH5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001141

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

CAGGOGTGTC OCAGGGGGAG OCOOGCICTG CAGOCCTGTG OGOCGTAGAG AGCTGGACTT 60 
AGGCTGGCAG QATfiGCCGAG TTCAGGGTCA GGGTGTCCAC CGGAGAAGOC TTCGGGGCTG 120 
GCACATGGG A CAAAGTGTCT GTCAGCATCG TGGGGAOCOG GGGAGAGAGC CCOCCACTGC 180 
OCCIT3G ACAA TCTOGGCAAG GAGTICACTG CGGGCX3CTGA GGAGGACTTC CAGGTGAOGC 240 
TCCCGGAGGA CGTAGGCCGA GTGCTGCTGC TCOGCGTGCA CAAGGCGCCC CCAGTGCTGC 300 
CCCTGCTGGG GCCCCTGGCC COGGATGCCT GGTTCTGCCG CTGGTTCCAG CTGACAOOGC 360 
CGCGGGGCGG CCACCTCCTC TTCCCCTGCT AOCAGTGGCT GGAGGGGGCG GGGACCCTGG 420 
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAG ACCA CCACCCTGTG CTCCAGCAAC 480 
AGCGCCAGOA GGAGCTTCAG GCCCGGCAGG AG ATGTACCA GTGGAAGGCT TACAACCCAG 540 
GTTGGCCTCA CTGCCTGG AT G AAAAGACAG TGG AAG ACIT GGAGCICAAT ATCAAATACT 600 
CCACAGCCAA GAATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAGATGAAAA 660 
TCAAGGGGTT GCTGGACCGC AAGGGGCTCT GGAGGAGTCT GAATGAGATG AAAAGGATCT 720 
TCAACTTCCG GAGGACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780 
TCTTOGCCTC CCAGTTOCTG AATGGTCTCA ACCCTGTCCT GATCCGCCGC TGTCACTACC 840 
TDOCAAAGAA CTTCCOOGTC ACTG ATGCCA TGGTGGCCTC ATTGTTGGGT CCTGGGACCA 900 
GCTTGCAGGC TGAGCTAGAG AAGGGCTOCC TGTTCTTGGT GGATCACGGC ATCCTCTCTG 960 
GCATOCAGAC CAATGTCATT AATGGGAAGC CGCAGTTCTC TGCGGCCCCA ATGAOCCTGC 1020 
TATACCAGAG CCCAGGCTGC GCGOCGCTGC TGCCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCOCATC TTCCTGCCCA CTGATGACAA GTGGGACTGG TTGCTCGCCA 1140 
AGACCTGGGT GCGCAATGOC GAGTTCTOCT TCCATGAGGC CCTCACGCAC CTGCTGCACT 1200 
CACATCTGCT G CCTGAGGTC TTCACCCTGG CTACCCTGCG TCAGCTGCCC CACTGCCACC 1260 
CTCTCTTCAA GCTGCTGATC CCGCACACCC GAT ACACCCT GCACATCAAC ACACTCGCGC 1320 
GGGAGCTGCT TATCGTGCCA GGGCAGGTGG TGOACAGGTC CACAGGCATC GGCATTGAAG 1380 
GCTTCTCIX3A GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTGTGTCTGC 1440 
CTGAGGATAT OOGGACOCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTGATGATG 1500 
GGATGCAGAT TTGGGGTGCA GTGGAACGCT TTGTCTCTGA AATCATCGGT ATCTACTAOC 1560 
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCIXXAGGC CTCGGTCAGA GAGATCTTCT 1620 
OCAAGGGCTT CCTAAAOCAG GAGAGCTCAG GTATOOCTTC CTCACTGG AG ACCGGGGAAG 1680 
CCCTCGTGCA GTATGTCAOC ATGGTGATAT TCACCTGCTC AGCCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTGACTOC TGTGCTTGG A TGCOCAAOCT GCCACCCAGC ATGCAGCTGC 1800 
CAOCAOOCAC CTOCAAAGGC CTGGCAACAT GOGAGGGCTT CATAGOCAOC CICOCAOCTG 1860 
TCAATGOCAC ATGTG ATGTC ATCCTTGCTC TCTGGTTGCT GAGCAAGGAG CCTGGAGACC 1920 
AAAGGCOOCT GGGCAOCTAT GOGG ATGAGC ACTTCACAG A GGAGGCCOCT CGGCGG AGCA 1980 
TXGCCACCXr CCAGAGCCGC CTGGCCCAGA TCTCGAGGGG CATCCAGG AG CGGAACCGGG 2040 
GCCTGGTGCT GOCCTACACC TACCTAG ACC CTCCOCTCAT OGAGAACAGC GTCTCCATCT 2100 
MATCCCAGG GGAACACAGG COCAG ATGAC ATCCCTTTGA CCACATCGCT CTAGGATAAC 2160 
TGGCACOCAG AGAAAAGGAC TCCTCAG AAA AAACAGGOCC (XATGTGOCT CTCCTGGGAC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAGAAAA TCTACCAAGA ACAGAGTCTC AGGACAG AAC CACTGAGTCT 2340 
TTTGGAGOCT CCAAGCCICA AAGTGOOOGC AGAGOOCAOC TTGAGGGTTT TGCTAGTTGG 2400 
TTTTOTTTTO CGTTTACAGC CCTGGGGGGA AGCACATAAT COCOCCOCAG GGCOCACTAO 2460 
CATCCACTGA TTGGACCTTA TGGTCACCCA ACTCAAGG AC AGOCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATGOCC ATAATCCCAG CACTTTGGGA G ATGGAGGCG 2580 
GGAAAATCAT TTGAGGTCAG AAGTTCAAGG CCAGCCTGGA CG ACATAGCG AGACTCCAOC 2640 
TCTACCAAAA AATAAAAATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence
Protein Accession #: NP_001132.1

1 11 21 31 41 51 
I I I I I I 

MAEFRVRVST GEAPGAGTWD KVSVSIVGTR GESPPLPLDN LGKEFTAGAE EDFQVTLPED 60 
VGRVLURVH KAPPVLPLLG PLAPDAWFCR WFQLTPPRGG HLLFPCYQWL EGAGTLVLQB 120 
GTAKVSWADH HPVLQQQRQB ELQARQEMYQ WKAYNPGWFH CXDEKTVEDL ELNKYSTAK 180
NANFYLQAGS AFAEMKUCGL LORKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240 
QFLNGLNPVL IRKCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHGILSGIQT 300 
NV1NGKPQFS AAPMTLLYQS PGCGPLLPLA IQLSQTPGPN SPIFLPTDDK WDWULAKTV/V 360 
RN AEFSFHEA LTHLLH5HLL PEVFTLATLR QLFHCHPLFK LUPHTRYTL HINTLARELL 420 
IVPGQWDRS TGIGIEGFSE LIQRNMKQLN YSLLCLPEDI RTRGVEDIPG YYYRDDGMQI 480 
WGAVERFVSE IIGIYYPSDB SVQDDRELQA WVRHFSKGF LNQESSOPS SLETREALVQ S40 
YVTMVIFTCS AKHAAVSAGQ FDSCA WMPNL PPSMQLPPPT SRGLATCEGF IATLPPVNAT 600 
CDVOALWLL SKEPGDQRPL GTYPDEHFTE EAPRRSIATF QSRLAQKRG IQERNRGLVL 660 
PYTYLDPPU ENSVSI 



SEQ ID NO:135 PFH4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002742

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

GAA ITOCTTC TCICCTOCTC CTCGCCCTTC TCCTCGCCCT CCTOCTCCTC CICGCCCTCC 60 
CCTOCCGATC CTCATCOOCT 7GCCCTCCCC CAGOCCAGGG ACTTTTOOGG AAAOT1 ITTA 120
1 i 1 10J G1 C1 GGGCTCTCGG AGAAAGAAGC TOCTGGCTCA GCGGCTGCAA AAC11 FCCTO 180 
CTGCCGCGCC GCCAGCCOCC GCCCTCCGCT GCCCGGOCCT GCGCCCCGCC G AGCGAT£AG 240 
CX3CCCCTCCG GTCCTGCGGC CGCCCAGTCC GCTGCTGCCC GTGGCGGCGG CAGCTGCCGC 300 
AGOGGOGGOC GCACTGGTOC CAGGGTOOGG GOCCGGGOCC GOGOCGTTCT TGGCTOCrGT 360
CGCGGOCOOG GTCGGGGGCA TCTOGTTCCA TCTOCAGATC GGOCTGAGCC GTGAGCOGGT 420
GCTGCTGCTG CAGGACTCGT CCGGGGACTA CAGCCTGGOG CACGTCCGCG AGATGGCTTG 480 
CTOCATTGTC GACCAGAAGT TCCCTG AATG TGGTTTCTAC GG AATGTATG ATAAG ATOCT 540 
GCTTTTTCGC CATGAOOCTA OCTCTGAAAA CATOCTTCAG CTGGTGAAAG CGGOCAGTG A 600 
TATCCAGGAA GGCGATCTTA TTGAAGTGGT CT7GTCACGT TCGGGCAGCT TTGAAGACTT 660
TCAGATTCGT COOCAOGCTC 1C11 1UHU A TTCATACAGA GCTCCAGCn' lUlGTGATCA 720
CTGTGG AGAA ATGCTGTCGG GGCTGGTAOG ICAAGGTCTT AAATGTGAAG GGTCTGGTCT 780 
GAATTACCAT AAG AGATGTG CATTTAAAAT ACOCAAGAAT TGCAGCGGTG TGAGGOGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT GAGCACCATC OGCACATCAT CTGCTGAACT 900 
CTCTACAAGT GCCCCTGATG AGOCCCTTCT GCAAAAATCA CCATCAGAGT OGTTTATTGG 960
TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGACGA CCAATTCACC TTGACAAGAT 1020
TTTCATGTCT AAAGTTAAAG TGOOGCACAC ATTTGTCATC CACTOCTACA CCCGGCCCAC 1080 
AGTGTGCCAG TACTGCAAGA AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1140 
AOATTGCAGA TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
CGAAGTG ACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 
AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAGACC ACGAGGACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440 
CAATATCOCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACX5AAGAGGA AAAGCAGCAC 1500 
AGTCATGAAA GAAGGATGGA TGGTCCACTA CAOCAGCAAG GACAOGCTGC GGAAAGGGGA 1560
CTATTGGAGA TTGGATAGCA AATGTATTAC CCTCTTTCAG AATGACACAG GAAGCAGGTA 1620
CTACAAGGAA ATTOCITTAT CTGAAATTTT GTCTCTGGAA CCAGTAAAAA CITCAGCnT 1680 
AATTCCTAAT GGGGCCAATC CTCATTGTTT CGAAATCACT ACGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTCCAG CCCATCACCA AATAACAGTQ TTCTCACCAO 1800 
TGGOGITGGT GCAG ATGTGG OCAGGATGTG GGAG ATAGCC ATCCAGCATG OOCTTATGGC 1860 
CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGGAACCAAC TTGCACAGAG ATATCTCTGT 1920
GAGTATTTCA GTATCAAATT GCCAGATTCA AG AAAATGTG GACATCAGCA CAGTATATCA 1980 
GATTTTTCCT GATGAAGTAC TGGGTTCTGG ACAGTTTGGA ATTGTTTATG GAGGAAAACA 2040 
TCGTAAAACA GGAAGAGATG TAGCTATTAA AATCATTGAC AAATTACGAT TTCCAACAAA 2100 
ACAAGAAAGC CAGCTTCOTA ATGAGGTTGC AATTCTACAO AACCTTCAT C ACCCTGGTGT 2160 
TGTAAATTTG GAGTGTATGT TTGAGACOCC TGAAAGAGTG TTTGTTGTTA TGGAAAAACT 2220
CCATGGAGAC ATGCTGG AAA TGATCTTGTC AAGTGAAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AGATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340 
CGTTCACTGT GACCTCAAAC CAGAAAATGT GTTGCTAGCC TCAGCTGATC CTTTTCCTCA 2400 
GGTGAAACTT TGTGATTTTG GTTTTGCCCG GATCATTGGA GAGAAGTCTT TCCGGAGGTC 2460 
AGTGGTGGGT ACCCCCGCTT ACCTGGCTCC TGAGGTCCTA AGGAACAAGG GCTACAATCG 2520
CTCTCT AG AC ATGTGGTCTG TTGGGGTCAT CATCTATGTA AGOCTAAGCG GCACATTCCC 2580 
ATTTAATGAA G ATG AAG ACA TACACGACCA AATTCAGAAT GCAGCTTTCA TGTATCCACC 2640 
AAATGCCTGG AAGG AAATAT CTCATG AAGC CATTGATCTT ATCAACAATT TGCTGCAAGT 2700 
AAAAATGAGA AAGCGCTACA GTGTGGATAA GACCTTGAGC CACXXTTGGC TACAGGACTA 2760
TCAGACCTGG TTAGATTTGC GAGAGCTOGA ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820
TGAAAGTG AT GACCTG AGGT GOGAGAAGTA TGCAGGCGAG CAGCGGCTGC AGTACCCCAC 2880 
ACACCTGATC AATCCAAGTO CTAGCCACAG TG ACACTCCT G AGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTGAGCGTG TCAGCATCCT CTJQAXJTTCCA TCTCCTATAA TCTGTCAAAA 3000 
CACTGTGGAA CTAATAAATA CATACGGTCA GGTTTAACAT TTGCCTTGCA GAACTGCCAT 3060
TATTTTCTGT CAGATGAGAA CAAAGCTGTT AAACTGTTAG CACTGTTGAT OTATCTGAGT 3120
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180 
AAAGTTCCCT GAAACACGAA ACTTGTTATT GTGAATGATT CATGTTATAT TTAATGCATT 3240 
AAACCTGTCT CCACTGTCCC TTTGCAAATC AGTGTTTTTC TTACTGG AGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TGAAGTAGTT CTGTTTGGTG TGTCCCATTG GTGTTGTCAT 3360 
TGTAAACAAA CTCTTGAAGA GTCGATTATT TCCAGTGTTC TATGAACAAC TOCAAAACCC 3420
ATGTGGGAAA AAAATGAATO AGGAGGGTAG GGAATAAAAT CCTAAG ACAC AAATGCATGA 3480 
ACAAGTTTTA ATGTATAGTT TTGAATCCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540 
ACTCAAGACA ATGCACCTAG CTGTGCAAGA CCTAGTGCTC TTAAGCCTAA ATGCCTTAGA 3600 
AATGTAAACT GCCATATATA ACAGATACAT TTCCCTCTTT CTTATAATAC TCTGTTGTAC 3660
TATGGAAAAT CAGCTGCTCA GCAACCTTTC ACCTTTGTGT All 1 1 ICAAT AATAAAAAAT 3720
ATTCTTGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence
Protein Accession #: NP_002733.1



1 11 21 31 41 51
I I I I I I 

MSAPPVLRPP SPLLPV AAAA AAAAAALVPG SGPGPAPRA PVAAPVGGB FHLQIGLSRE 60 
PVIXLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK ILLFRHDPTS ENHjQLVKAA 120 
SDIQEODLIE WLSRSATFE DFQIRPHALF VHSYRAPAFC DHCGEMLWGL VRQOLKCEQC 180 
GLNYHKRCAF KIPNNCSGVR RRRLSNVSLT GVSTIRTSSA ELSTS APDEP LLQKSPSESF 240 
IGREKRSSSQ S YIGRPIHLD KILMSKVKVP HTFVIHSYTR FTVCQYCKKL LKGLFRQGLQ 300 
CKDCRFNCHK RCAPKVPNNC IjGEVTINGDL LSPG AESDW MEEGSDDNDS ERNSGLMDDM 360 
EEAMVQDAEM AMAECQNDSG EMQDPDPDHB DANRTISPST SNNIPLMRW QSVKHTKRKS 420 
STVMKEGWMV HYTSKDTLRK RHYWRLDSKC ITLFQNDTGS RYYKEIPLSB ILSLEPVKTS 480 
ALIPNGANPH CFEITTANW YYVGENWNP SSPSPNNSVL TSGVGADVAR MWEIAIQHAL 540 
MPV1PKGSSV GTGTNLHRDI SVSIS VSNCQ IQENVDISTV YQIFPDEVLG SGQFGIVYGG 600 
KHRKTGRD VA IKUDKLRFP TKQESQLRNE VAILQNLHHP GWNLECMFE TPER VFWME 660 
KLHGDMLEMI LSSERGRUPE HITKFIJTQI LVALRHLHFK NIVHCDLKFE NVLLAS ADPF 720 
PQVKLCDFGF ARUGEKSFR RSWGTPAYL APBVLRNKGY NRSLDMWSVG VHYVSLSGT 780 
FPFNEDEDIH DQIQNAAPMY PPNFWKEISH EAIDUNNLL QVKMRKRYS V DKTLSHPWLQ 840 
D YQTWLDLRE LECK1GERY1 THESDDLR WE KY AGEQRLQY FTHLINPSAS HSDTPETEET 900 
EMKALGERVSIL 



SEQ ID NO:137 PFH3 DNA SEQUENCE

Nucleic Acid Accession #: X95425

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

AATGGTCAGT CAATACATTA T AACATAATA CACCAAATGC TAGAATAGAA GGGGAGGGGG 60 
GCACACATAA TGACTCACTG CTGGAAGAAG GGTGCATCAG TGAATTAAAA AATGTOCCTC 120 
CCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GTCTCTTTGA ACTCTGGATC 180 

nroci 1 1 10 ctcoctgctc To cmi u 1 1 cattctccac A T m ciuAA Tc un:i i ic r 240 

TT AT0CTTAG CCACCCTGCT TTTTTOCTCC TTTTTTAAAA AATOGG AG AT TTCGTCTTAA 300 
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGGCTGCAAA GAACTTCACC 360 
TTTOCOCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTGA AAGGCAAAGG 420 
AACAGGACCC AGACCCTCTC G ACACCCTTG ATOOGAGTCA GATCTGCACT AGCAAOCAGA 480 
ACTAATATTT CATTTAACOC ACCAAAAGGG GG AGGCGAGA GGAGCCAG AA GCAAACTTCA 540 
TCTGTCTCAG ACGG ATCCGT GGTTCCTACA TTTGGAGGAG CCGCCTGTCA GAAGGCGTAG 600 
GACOCCAAGG GGGGACAAGG AGGACTCCCG AGTCTCCCTT CTCCGCTCTC CGAGACCGAA 660 
G AGGTGG ACT GAGCOGCTCG GGACAGCGGC ACCGG AGGAG GCTCGGAGAA GATGOGGGGC 720 
TCGGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCGA CACCCCCATC 780 
AOCCCA GCGT OOCTGGOOGO CIGCTACTCT GCACCTCGAC GGGCTCCCCT C TG G A C GTGC 840 
CTTCICCIGT GCGCCGCACT CCGGAOOCTC CTGGCCAGCC CCAGCAACGA AGTGAATTTA 900 
TTGOATTCAC GCACTGTCAT GOGGGACCTQ GGATGGATTG CI 1 1 IOCAAA AAATGGGTGG 960 
GAAGAGATTG GTG AAGTGGA TGAAAATTAT GCCOCTATCC ACACATACCA AGTATGCAAA 1020 
GTGATGGAAC AGAATCAGAA TAACTGGCTT TTGAOCAGTT GGATCTOCAA TGAAGGTGCT 1080 
TOCAGAATCT TCATAG AACT CAAATTTACC CTGCGGGACT GCAACAGOCT TCCTGGAGGA 1140 
CTGGGGAOCT GTAAGG AAAC CTTTAATATG TATTACTTTG AGTCAGATG A TCAGAATGGG 1200 
AG AAACATCA AGGAAAACCA ATACATCAAA ATTGATACCA TTGCTGCCGA TGAAAGCTTT 1260 
ACAG AACTTG ATCTTGGTGA GOGTGTTATG AAACTGAATA CAG AGGTCAG AGATGTAGGA 1320 
CCTCTAAGCA AAAA GGGATT TTATCTTGCT TTTCAAGAT G TTGGTGC1 10 CATTGCTCTG 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC CCTTCTGTGG TACGACACTT GGCTGTCTTC 1440 
CCTGACACCA TCACTGGAGC TOATTCTTCC CAATTGCTCG AAGTGTCAGG CTCCTGTGTC 1500 
AACCATTCTG TGACCGATGA ACCTOCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GGAAATGCAT GTGCAAGGCA GGATATGAAG AGAAAAATGG CACCTGTCAA 1620 
GTGTGCAG AC CTGGGTTCTT CAAAGCCTCA GCTCACATGC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATG A GGAAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740 
AGGAG AG AGT CTGATCCAOC CACAATGGCA TGCACAAOAC CCCCCTCTGC TCCTCGGAAT 1800 
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTOCGCC TGCTG ACACT 1860 
GGTGGAAGG A AAGACGTGTC ATATTATATT GCATGCAAGA AGTGCAACTC CCATGCAGGT 1920 
GTGTGTGAGG AGTGTGGCGG TCATGTCAGG TACCTTCCCC GGCAAAGCGG CCTGAAAAAC 1980 
ACCTCTGTCA TGATGGTGGA TCTACTCGCT CACACAAACT ATACCTTTGA GATTGAGGCA 2040 
GTGAATGGAG TGTCCGACTT GAGCCCAGGA GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100 
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAAAT TGCAAAAAAC 2160 
AGCATCTCTT TGTCTTGGCA AGAACCAG AT CGTCCCAATG GAATCATCCT AGAGTATG AA 2220 
ATCAAGCATT TTGAAAAGGA CCAAGAGACC AGCTACACG A TTATCAAATC TAAAG AGACA 2280 
ACTATTACTG CAGAGGGCTT GAAACCAGCT TCAGTTTATG TCTTCCAAAT TCGAGCACGT 2340 
ACAGCAGCAG GCTATGGTGT CTTCAGTCGA AGATTTGAGT TTGAAACCAC COCAGTGTTT 2400 
GCAGCATCCA GCGATCAAAG CCAGATTCCT GTAATTGCTG TGTCTGTGAC AGTAGG AGTC 2460 
ATrTTGTTGG CAGTGGTTAT CGGCGTCCTC CTCAGTGG AA GTTGCTGCG A ATGTGGCTGT 2520 
GGGAGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATOC TAAT ATGGCG GTGTGGCTAC 2580 
AGCAAAGCAA AACAAGATCC AGAAGAGGAA AAGATGCATT TTCATAATGG GCACATTAAA 2640 
CTGCCAGGAG TAAGAACTTA CATTGATCCA CATACCTATG AGGATCCCAA TCAAGCTOTC 2700 
CACGAATTTG CCAACGAGAT AG AAGCATCA TGTATCACCA TTGAGAGAGT TATTGGAGCA 2760 
GGTG AATTTG GTGAAGTTTG TAGTGGACGT TTGAAACTAC CAGGAAAAAG AGAATTACCT 2820 
GTCGCTATCA AAACCCTTAA AGTAGGCTAT ACTGAAAAGC AACGCAGAGA TTTOCTAGGT 2880 
GAAGCAAGTA TCATGGGACA GTTTGATCAT CCTAACATCA TOCATTTAGA AGGTGTGGTG 2940 
ACCAAAAOTA AAOCAGTG AT OATOOTOACA GAGTATATGG AGAATGGCTC TTTAGATACA 3000 
TmTOAAGA AAAACGATGG GCAGTTCACT GTOATTCAGC TTGTTGGCAT GCTGAGAGGT 3060 
ATCTCTGCAG GAATGAAGTA CCTTTCTGAC ATGGGCTATG TGCATAGAGA TCTTGCTGCC 3120 
AGAAACATCT TAATCAACAG TAACCTTGTG TGCAAAGTGT CTGACTTTGG ACTTTOCCGG 3180 
GTACTGGAAG ATGATCCCGA GGCAGCCTAC ACCACAAGGG GAGGAAAAAT TCCAATCAGA 3240 
TCGACTGCCC CAGAAGCAAT AGCTTTCCGA AAGTrTACTT CTGOCAGTGA TGTCTCGAGT 3300 
TATGGAATAG TAATGTGGGA AGTTGTGTCT TATGGAGAGA GACCCTACTG GGAGATQACC 3360 
AATCAAGATO TGATTAAAGC GGTAGAGGAA GGCTATOGTC TCCCAAGOOC CATGG ATTGT 3420 
CCIGCTGCTC 7CTATCAGTT AATGCTOG AT TGCTGGCAGA AAGAGOG AAA TAGCAGGOOC 3480 
AAGTTTCATO AAATAGTCAA CATGTTGG AC AAGCTGATAC GTAACCCAAG TAGTCTGAAG 3540 
ACGCTGGTTA ATGCATCCTG CAGAGTATCT AATTTATIGG CAGAACATAG COCACTAGGA 3600 
TCTGGGGCCT ACAGATCAGT AGGTG AATGG CTAGAGGCAA TCAAG ATGGG CCGGTATACA 3660 
GAGATTTTCA TGGAAAATGG ATACAGTTCA ATCG ACGCTG TGGCTCAGGT GAOCITOGAG 3720 
GATTTCAG AC GGCTTGG AGT GACTCT7GTC GGTCACCAG A AGAAGATCAT GAACAGOCTT 3780 
CAAGAAATGA AGGTGCAGCT GGTAAACGGA ATGGTGCCAT TGTAACTTCA TGTAAATGTC 3840 
GCTTCTTCAA GTGAATGATT CTGCACTTTG TAAACAGCAC TGAGATTTAT TTTAACAAAA 3900 
AAA 
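
The layout conventions used throughout this listing (ten-base blocks, a running position count at the end of each full line, and a 1-based, inclusive "Coding sequence" range) can be checked mechanically. The following is a minimal Python sketch and is not part of the original listing. The function names parse_listing, extract_cds and check_cds and the toy input are hypothetical, and the sketch assumes the text is free of scanning errors.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_listing(lines):
    """Join listing lines into one uppercase string.

    Drops the spaces between ten-base blocks and the running
    position count that ends each full line (assumed layout).
    """
    parts = []
    for line in lines:
        # Remove a trailing position counter such as " 60" or " 120".
        body = re.sub(r"\s+\d+\s*$", "", line)
        # Remove all remaining whitespace between blocks.
        parts.append(re.sub(r"\s+", "", body).upper())
    return "".join(parts)


def extract_cds(seq, start, end):
    """Return the coding sequence for a 1-based, inclusive range."""
    return seq[start - 1:end]


def check_cds(cds):
    """Report basic consistency checks on a coding sequence."""
    return {
        "starts_with_ATG": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
        "length_multiple_of_3": len(cds) % 3 == 0,
    }


# Toy input (hypothetical, not taken from the listing).
toy_lines = ["GGCCATGAAA TTTGGGTAAC 20", "CCGG"]
toy_seq = parse_listing(toy_lines)
toy_cds = extract_cds(toy_seq, 5, 19)
print(toy_cds, check_cds(toy_cds))
```

A range that fails any of the three checks usually points to a transcription problem in either the header coordinates or the sequence lines rather than to a property of the gene itself.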



SEQ ID NO:138 PFK3 Protein sequence:
Protein Accession #: CAA64700.1

1 11 21 31 41 51
| | | | | |

MRGSGPRGAG HRRPPSGGGD TPITPASLAG CYS APRRAPL WTCLLLCAAL RTLLASPSNB 60 
VNLLDSRTVM GDLGWIAFPK NGWEEIGEVD BNY APIHTYQ VCKVMEQNQN NWLLTSWISN 120 
EGASRIFIEL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNIKENQ YnUDTIAAD 180 
ESFTELDLGD RVMKLNTBVR DVGPLSKKGF YLAFQDVGAC IALVSVRVYY KKCPS WRHL 240 
AVFPDTITGA DSSQLLEVSG SCVNHSVTDE PPKMHCSAEG EWLVPIGKCM CKAGYEEKNG 300 
TCQVCRPGFF KASPHIQSCG KCFFHSYTHE EASTSCVCEK DYFRRESDPP TMACTRPPSA 360 
PRNAISNVNE TS VFLEWIFP ADTG GRKDVS YYIACKKCNS HAG VCEECGG HVRYLPRQSG 420 
LKNTSVMMVD LLAHTNYTFE EAVNGVSDL SPGARQYVSV NVTTNQAAPS PVTNVKKGKI 480 
AKNSKLSWQ EPDRPNGOL EYHKHFEKD QBTSYTIIKS KETTtTABGL KPASVYVFQI 540 
RARTAAGYGV FSRRFBFETT PVFAASSDQS QDPVIAVS VT VGVHXAW1 OVLLSGSCCE 600 
CGCGRASSLC AVAHPUJWR CGYS KAKQDP EEEKMHFHNG HKLPGVRTY IDPHTVEDPN 660 
QAVHEFAKH EASOTIERV IGAGEFGBVC SGRLKLPGKR ELPVADCTUC VGYTEKQRRD 720 
FLGEASIMGQ FDHPNIIHLE GWTKSKPVM WTEYMENGS LDTFLKKNDG QFTVIQLVGM 780 
LRGISAGMKY LSDMGYVHRD LAARNQJNS NLVCKVSDFG LSRVLEDDPE AAYTIRGGKI 840 
PIRWTAPEAI AFRKFTSASD VWSYGIVMWE WSYGERPYW EMTNQDVKA VEEGYRLPSP 900 
MDCPAALYQL MLDCWQKERN SRPKFDEIVN MLDKURNPS SLKTLVNASC RVSNU-AEHS 960 
PLGSGAYRS V GEWLEAIKMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKKIM 1020 
N5LQEMKVQL VNGMVPL 
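
Each protein listing is the conceptual translation of the coding range given with the preceding DNA sequence. As an illustration only, the Python sketch below translates a coding sequence with the standard genetic code. The function name translate is hypothetical. The example input is the first ten codons of SEQ ID NO:137 starting at position 712, with the scanned character "O" read as "C", which is an assumption.

```python
# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def translate(cds):
    """Translate a coding sequence up to, not including, the first stop.

    Raises ValueError if a character other than T, C, A or G is found,
    which is how scanning errors in the listing would surface.
    """
    cds = cds.upper()
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        b1, b2, b3 = (BASES.index(base) for base in cds[i:i + 3])
        residue = AMINO_ACIDS[16 * b1 + 4 * b2 + b3]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First ten codons of the SEQ ID NO:137 coding range ("O" read as "C").
print(translate("ATGCGGGGCTCGGGGCCCCGGGGTGCGGGA"))
```

Under the stated assumption, the result agrees with the first ten residues of SEQ ID NO:138 as listed above.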



SEQ ID NO:139 PFH2 DNA SEQUENCE

Nucleic Acid Accession #: NM_016029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CTGCGATCOC GCAGGGCAGC GACGCGACTC TGGTOOGGGC O GTCTIC1 1C CCCCCGAGCr 60 
GOOOOTGOGC GGCCGCAAIS AACTOGGAGC TGCIX J CTCIG GCTCCTGGTG CTGTGCGCGC 120 
TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGAOGCTAC 180 
TATGGGCCGA GTGGCAGGGA CG ACGCCCAG AATGGGAGCT GACTG ATATG GTGGTGTOGG 240 
TGACTGGAGC CTOGAGTGG A ATTGGTGAGG AGCTGGCTTA OCAGTTGTCT AAACTAGGAG 300 
TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 
TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTIGAC CTGACCG ACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGG A GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 
ACAG AAAGCT AATAGAGCTT AACTACTTAO GGACGGTGTC CTTGACAAAA TGTGTTCT GC 600 
CTCACATG AT CGAGAGG AAG CAAGGAAAG A TTGTTACTGT GAATAGCATC CTGGGTATCA 660 
TATCTQTACC tCI 1 1 O CATT GGATACTDTQ CTAGCAAGCA TOCICTOCGG G O 1 1 1 1 1 1 T A 720 
ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780 
GACCTOTGCA ATCAAATATT GTGGAGAATT CCCTAGOGG AGAAGTCACA AAG ACTATAG 840 
GCAATAATGG AGACCAGTCC CACAAGATG A CAACCAGTCO TTOTGTGCGG CTGATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GG ATCTCAGA ACAACCTTIC TTGTTAGTAA 960 
CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAG A 1020 
AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 
AGACAAAACA TGACIQAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140 
AAACATG AAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 
ACTTTTTAAT AG ATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AG ATTGCCAT G AATCTTGCA AA 



SEQ ID NO:140 PFH2 Protein sequence:
Protein Accession #: NP.0571111

1 11 21 31 41 51
| | | | | |

MNWELLLWLLVLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MWWVTQASS 60 
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 
ATKAVLQEPG RIDDLVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180 
KQGKTVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240 
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LXEVW1SEQP FLLVTYLWQ Y 300 
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD 
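
The coding range in each DNA header also fixes the expected length of the protein listing that follows it. For SEQ ID NO:139, the range 78-1097 spans 1020 nucleotides, or 340 codons including the stop codon, which predicts 339 residues. This is consistent with the running count of the listing above (300 residues plus a final partial line of 39). A small Python sketch of the arithmetic follows. The function name expected_protein_length is hypothetical.

```python
def expected_protein_length(start, end):
    """Residue count implied by a 1-based, inclusive coding range.

    The range is assumed to include the start codon and the stop
    codon, as the listing headers state.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(
            f"range {start}-{end} spans {length} nt, not a multiple of 3"
        )
    # One codon is the stop codon and encodes no residue.
    return length // 3 - 1


print(expected_protein_length(78, 1097))
```

A range whose span is not a multiple of three raises an error, which flags a header or line count that may have been altered during scanning.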



SEQ ID NO:141 PFH1 DNA SEQUENCE

Nucleic Acid Accession #: NM_021614

Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

AISAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACTT GAGCGCGTCC 60 
CGOCGGAACC TGCAOG AGAT GGACTCAGAG GCGCAGCCCC TGCAGCGCCC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCOGTCT GCAGCCGCTG CCQCCQCGGC OGCTGTTTCG 180 
TCCTCAGCCC CCGAGATCGT GGTGTCTAAQ CCCGAGCACA ACAACTOCAA CAACCTGGCO 240 
CTCTATGGAA COGGCGGCGG AGGCAGCACT GGAGGAGGCG GCGGCGGTGG CGGGAGCGGG 300 
CACGGCAGCA GCAGTGGCAC CAAGTOCAGC AAAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGOCACC GGOGOGCOCT GTTCGAAAAG OGCAAGOGGC TCAGCGACTA CGCGCTCATC 420 
TTCGGCATGT TCGGCATCGT GGTCATGGTC ATCGAGACOG AGCTGTCGTG GGGCGCCTAC 480 
GACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGOCTTA TCAGTCTCTC CACG ATCATC 540 
CTGCTCGGTC TGATCATCGT GTACCACGCC AGGGAAATAC AGTTGTTCAT GGTGGACAAT 600 
GGAGCAGATG ACTGG AGAAT AGCCATGACT TATGAGCGTA TTTIdTCAT CTGCTTGGAA 660 
ATACTGGTGT GTGCTATTCA TCOCATAOCT GGGAATTATA CATTCACATG GACGGOOCGG 720 
CTTGCCTTCT CCTATCCCCC ATCCACAACC ACCGCTGATG TGGATATTAT TTTATCTATA 780 
CCAATGTTCT TAAGACTCTA TCTGATTGCC AGAGTCATGC TTTTACATAG CAAA C ' l 11 1C 840 
ACTG ATGCCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACOTTTT 900 
GTTATGAAGA CTTTAATGAC TATATGCCCA GGAACTGTAC TCTTGGnTT TAGTATCTCA 960 
TTATGGATAA TTGCCGCATG GACTGTCCGA GCTTGTGAAA GGTAOCATGA TCAACAGGAT 1020 
GTTACTAGCA ACTTCCTTGG AGCGATGTGG TTGATATCAA TAACTTTICr CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GOAAAAGGAG TCTGCTTACT TACTGGAATT 1140 
ATGGGTGCTG GTTGCACAGC CCTGGTGGTA GCTGTAGTGG CAAGG AAGCT AG AACTTACX 1200 
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGGATACTC AGCTOACTAA AAGAGTAAAA 1260 
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAAnT ACAAAAATAC AAAGCTAGTG 1320 
AAAAAGATAG ATCATGCAAA AGTAAG AAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380 
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTGAATG ACCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AGACCCAGAA CATCATGTAT GATATGATTT CTGACTTAAA CGAAAGG AGT 1500 
GAAG ACITCG AG AAGAGG AT TGTTACCCTG GAAACAAAAC TAGAGACTTT GATTGGTAGC 1560 
ATOCACGCCC TOGCTGGGCT CAT AAGOC AG A CCATCAGGC AGCAGCAG AG AGATTTCATT 1620 
GAGGCTCAGA TGG AGAGCTA CGACAAGCAC GTCACTTACA ATGCTGAGCG GTCCCGGTCC 1680 
TOGTCCAGG A GGOGGCGGTC CTCTTCCACA GCACCACCAA CTTCATCAGA GAGTAG CTAG 



SEQ ID NO:142 PFH1 Protein sequence:
Protein Accession #: NP_067627

1 11 21 31 41 51
| | | | | |

MSSCRYNGGV MRPLSNLSAS RRNLHEMDSE AQPLQPPASV GGGGGASSPS AAAAAAAAVS 60 
SSAPEIWSK PEHNNSNNLA LYGTGGGGST GGGG GGGOSG HGSSSQTKSS KKKNQNIGYK 120 
LGHRRALFEK RKRLSD Y ALI FGMPGIWMV IETELS WGAY DKASLYSLAL KCLELSTH 180 
LLGIUVYHA RHQLFMVDN GADDWRIAMT YERIFHCLE ILVCAIHPIP GNYTFTWTAR 240 
LAFSYAPSTT TAD VDIILSI PMFLRLYUA R VMLLHSKLF TDASSRSIGA LNKINFNTRF 300 
VMKTLMTICP GTVLLVFSIS LWIIAAWTVR ACERYHDQQD VTSNFLGAMW USITFLSIG 360 
YGDMVPNTYC GKG VCLLTGI MGAGCTALW A WARKLELT KAEKHVHNFM MDTQLTKRVK 420 
NAAANVLRET WUYKNTKLV KKIDHAKVRK HQRKFLQAIH QLRS VKMEQR KLNDQANTLV 480 
DLAKTQNIMY DMISDLNERS EDFEKRTVTL ETKLETLIGS MALPGLKQ TIRQQQRDFI 540 
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE
Nucleic Acid Accession #: All 10139, coding region is FGENESH predicted
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

AISCGCGCCG TGCCGCTGCC CGCCCCGCTC CTGOOGCTGC TGCTGCTOGC GCTOCTGGCC 60 
GCTCCCGCCG CCCGCGCCAG CAGAGCCGAG TCCGTCTCCG CGCCGTGGOC CX3AACCCGAG 120 
CGCGAGTCGC GGCCACOGCC CGGCCCGGGG CCCXJGGAACA CCACOCGGTT TGGGTCTGGG 180 
GCGGOGGGOG GCAGCOGCAO CTOCAGCTOC AACAGCAGTG GOGACGOCTT GGTGACCCGC 240 
ATTTCCATCC TCCTOCGCGA CCTACCCACC CTCAAGGCAG CCGTQATCGT GGCGTTCGCC 300 
TTTACCACCC TCCTCATCGC CTGOCTGCTG CTGCGCGTCT TCAGGTCGGG AAAGAGGTTA 360 
AAG AAGACAC GCAAGTATGA TATCATCAOC ACTOCAGCAG AGCGAGTGGA AATGGCGOCA 420 
CTAAATGAAG AGGATGATGA AGATGAGGAC TCCACAGTAT TCGACATCAA ATACAGAGTG 480 
TOCTTGOCGG CTGCACTG AG AOGTCAGCTG CCAGGGTGCC AGACGCTACT GACAGTTCCT 540 
GTGCCCCCAC CXTICATOCT CGACATTGAC CTTCCAGCAA GATGCAGTGG AAGGCCTGAT 600 
GGTGG AATCA GACCTGGTAA AACCTGTTTC CCAGCCTGGT GGCATCCTGT GGAAAGTTGG 660 
TCAGCTGCAA CCTGGGGTGT GAAGOACTGG ACCTGG AAGC CCTCTTGOGT CGG AGGTGTT 720 
GAAACCAAAA CG AACGTTAT GTATAAAACC OCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780 
TCAGACTGTC ACTGGCAAGC TOG ITTOCAC GTCACCACAA TGGAGTTGCT TCTGCCACCC 840 
TTTGGGCATC CCTTTAAAGT GCCOCCTACT TCTACTOOCC ATGGTTTTOG ACAACTGCAG 900 
CTG AATCTCA TGG AAAAGCT GGATTCCTCT GCCTTACGCA GAAACAO00G GGCTOCATCT 960 
GCCAGGTGCT TGCCACTGGT CCTGGCAGAA ATGGCGGCTG CTGAAAGTGA CCTTCCAAAT 1020 
CCTTGGTGGC ACTTCAGCCC CACAGGCTCT CCAATAAAAA CCCTTTACAC ACAAACCATG 1080 
AGTACCTTGG GCTTGGATGT TTTCTGTGGT GCCGGCCAGC GGGGCACCTT TTGTG AAGAC 1140 
AG AGCAGTGA CTAAGGTTCT CCAGGGTAGC TCTTTCTCCA AACAGCTGCO CTGGAAGGCA 1200 
GCCCTAGAGA GTGGGTTTOC CCATCATCTC AGGCTTCTCA GAGAGTGTOC TOCGCTGAGC 1260 
ACOCATCCTG TCAGGTTGGC TOGTTCAG AT GOOOGGGGAC AAGCCAGCCT GACGGGG AGG 1320 
AGGGTGTTTC GGCGTCCGCG GCAGTCTCTG CATGGCGGAG GGTCAGCGGG TAOCGCAACT 1380 
TGCCTTTTGG TTTTG AAG AT TCTGTTGAGG CGCCATCCTC ACCTTGACCT CTTCTACAAA 1440 
ATCTGTCTOC OCTGCTGTGC CGTGGAACAC CTACGGGAAG GCAAG AGAAG CTCAGTGACT 1500 
GTCCTTGCGT CATTTGAGCA GAGCOCACAA AAGGCAGCFG CTGCCCACGG GGAGGCTGTC 1560 
AAACGAGGGC CCAGTGGGCA ATTGACCAGA CACACATGGC CTGGCTGGGG GATCACACAT 1620 
GCGAACCTGC AGACAATTCC AGATAOCCAA GGCCAGGAAG GOCCACGTGA GGATGTCACT 1680 
CACCCTGGAG GAGACTTGGA TGGGGTGGCA AATTTCTATT TGGAGGAAGA GGGTTTCCAG 1740 
G ATGGCAG AT GOCAG AAG AT GGTOCTGATG TCTGAGGAAG GGCCACCTAG TTTG ACAGG A 1800 
TGTGAG AGGC TCACAGGTTC CCATCACTTC TCCAGCCATT CCAAGTCTTG GTCCTTOCTT 1860 
TCCCCCCGAC AGCCCCTGTTTCTGTGCAGG COCTGA 






SEQ ID NO:144 PFG9 Protein sequence:

Protein Accession #: none available, FGENESH predicted

1 11 21 31 41 51
| | | | | |

MRAVPLPAPL LPLLLLALLA APAARASRAE S VS APWPEPE RESRPPPGPG PGNTTRPGSG 60 
AAGGSGSSSS NSSGDALVTR IS1LLRDLPT LKAAVTVAFA FTTLUACLL LRVFRSGKRL 120 
KKTRKYDOT TPAERVEMAP LNEEDDEDED STVFDJKYRV SLPAALRRQL PGCQTLLTVP 180 
VPPPFILDID LPARCSGRFD GGIRPGKTCF PAWWHPVESW SAATWGVKDW TWKPSCVGGV 240 
ETKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQLQ 300 
LNLMEKLDSS ALKRNTRAPS ARCLPLVLAE MAAAESDLPN PWWHFS ATGS PDCTLYTQTM 360 
STLGLD VFCG AGQRGTPCED RA VTKVLQGS SFSKQLRWKP ALESGFPHHL RLLRECPPLS 420 
THFVRLARSD ARGQASLTGR RVFRRPRQSL HGGGSAGTAT CLLVLKILLR RHPHLDLFYK 480 
ICLPCCAVEH LREAKRSSVT VLASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWGITH 540 
ANLQTIPDTQ GQEGPREDVT HPGGDLDGVA NFYLEEEGFQ DGROQKMVLM SEEGPPSLTO 600 
CERLTGSHHF SSHSKSWSFL SPRQPLFLSR P 

SEQ ID NO:145 PFG8 DNA SEQUENCE

Nucleic Acid Accession #: NM_013427

Coding sequence: 875-3799 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GGCTGGGCTG CG AATAGCGT GTTCCTCTCC GGCGGAACAC ACACACCCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCCCTCCT CCACGG AG AG CGCTGAGCGC CGCCGGGAAT TCCATCCCAC 120 
CGTGGGCACG CAGTCTTTGG AGGTOCCGGG CGCAGCACGC TCGGTGTCCC CACACTGCAG 180 
CAAGACAG AG ACCCCGCGGG AACCTTGAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAGAGTGGGC GCAGCAGCCC AGCGGAGGCC AGGCGCGCAA CXTTCGGGCGC CGGGGCAAGG 300 
AGAGAGTGCA GGG AGGCGCA GCTCAGGCGC CCGGCTCAGG AGOGGGAGGA AGTTCTCGCG 360 
GCGCCGGGAG CGCGGTGGAC GCGCCCTGGG CGCACGCCCA GGCAGCCTTC TCCCTGGCCC 420 
TCGGGACTGT CCTCGGGCGG CAAGGAGGAG CTTGCTGGAG TCTTAGAGGC CATCCAGAGC 480 
CAGCGAGCAG GAGGGCTGOG TCTCCCGCCT CAGCTAGG AA GGGGG AGTGG CGCTGGCAGG 540 
CTGG AGCTGG GAACCC AGCG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCCTCTTCGC 600 
GTCTTGGGCT CCGGAGGAAG GTTCTAGCGG CTGCAGGAGG TCCCCAG ACC CATTTTCCTA 660 
GAAGGCTGGT OATGG ATCTG CTGCTCCTGC CGCCCOOGGG GCACTTGGAG CGCACCGGCG 720 
GCGCGTGAGC TGGGCTTTGC TCTCCACCGC OCTGGGCAAA GOCCGGGCCA GCCOCGOCTG 780 
GCACCTTTGC CTG AGTCCCT TTCGCTTCCC GACCCAAAGC CACCAGCGTC GAGGGAGGGA 840 
GGAGGAGGTG GTCCTCAGGT GCAGCCCCGC CGAGATjQTOC GCGCAG AGCC TGCTCCACAG 900 
CGTCTTCTCC TO 1 IIXTI C GC CCGCTTCAAG TAGCGCGGCC TCGGCCAAGG GCTTCTCCAA 960 
GAGGAAGCTG CGCCAG ACCC GCAGCCTGGA CCCGGCCCTG ATCGGCGGCT GCGGGAGCGA 1020 
CGAGGCGGGC GCGG AGGGCA GTGCGCGGGG AGCCACGGCG GGCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC GAG AGTCTCG GOCCTCGCTT GGCGTCCTCT TCCCGGGGTC CGCCCCCCAG 1140 
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200 
GCAGG AG AAG TCACCATCCG GCAGCTTTCA CTTTG ACTAT GAGGTTCCCC TGGGTOGOGG 1260 
CGGOCTCAAG AAG AGCATGG CCTGGG ACCT GCCTTCTGTC CTGGOCGGGC CAGCCAGTAG 1320 
CCGAAGOGCT TOCAGCATOC TCTGTTCATC CGGGGGAGGC CCCAATGGCA TCTTCGCTTC 1380 
TCCTAGGAGG TGGCTCCAGC AGAGGAAGTT CCAGTCCOCA CCCGACAGTC GCGGGCACOC 1440 
CTACGTCGTG TGGAAATCCG AGGGTGATTT CAOCTGGAAC AGCATGTCAG GCCGCAGTGT 1500 
GCGGCTGAGG TCAGTCCCCA TCCAOAOTCT CTCAOAGCTG GAGAGGGOCC GGCTGCAGGA 1560 
AGTGOCTTTT TATCAGTTGC AACAGGACFG TG ACCTGAGC TGTCAG ATCA OCATTCCCAA 1620 
AGATGGACAA AAG AGAAAGA AATCTTTAAG AAAGAAACTG GATTCACTAG GAAAOGAGAA 1680 
AAACAAAGAC AAAG AATTCA TCCCACAGGC ATTTGGAATG OOCTTATCOC AAGTCATTGC 1740 
GAATGACAGG GCCTATAAAC TCAAGCAGGA CTTGCAGAGG GACGAGCAGA AAGATGCATC 1800 
TGACTTTGTG GCTTCCCTCC TCCCATTTCG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860 
CAGTAACTCA TCTCTCAGCT CAAOCTCAG A AACACOGAAT GAGTCAACGT CCCCAAACAC 1920 
CCCGGAACCG GCTVOTCGQQ CTAGGAGGAG GGGTGCCATG TCAGTGGATT CTATCACCG A 1980 
TCTTGATGAC AATCAGTCTC GACTACTAGA AGCTTTACAA CTTTOCTTGC CTGCTGAGGC 2040 
TCAAAGTAAA AAGG AAAAAG CCAGAG ATAA GAAACTCAGT CTG AATCCTA TTTACAG ACA 2100 
GGTCCCTAGG CTGGTGGACA GCTGCTGTCA GCAOCTAGAA AAACATGGCC TCCAG ACAGT 2160 
GGGGATATTC CGAGTTGGAA GCTCAAAAAA GAGAOTGAGA CAATTACGTG AGO AATT TOA 2220 
CCGTGGGATT GATGTCTCTC TGGAGGAGGA GCACAGTGTT CATGATGTGG CAGCCTTGCT 2280 
GAAAGAGTFC CTG AGGGACA TGCCAG ACCC CCTTCTCACC AGGGAGCTGT ACACAGCTTT 2340 
CATCAACACT CICTTGTTGG AGCCGGAGGA ACAGCTGGGC ACCTTGCAGC TCCTCATATA 2400 
O CTICI A CCT OCCTOCAACT GCGACAOOCT CCACCGCCIX} CTACAGTTCC TCTCCATOGT 2460 
GGCCAGGCAT GCCGATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTG GGAATAAAAT 2520 
GACATCTCTA AACTTAGCCA CCATATTTCG ACCCAACCTG CTGCACAAGC AGAAGTCATC 2580 
AGACAAAGAATTCTCAGTTC AGAGTTCAGC CCGGGCTGAG GAG AGCACGG CCATCATCGC 2640 
TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTOCCC CAG ATCTOCA 2700 
GAAOGAAGTG CTGATCAGCC TGTTAG AGAC CGATCCTGAT GTOGTGGACT ATTTACTCAG 2760 
AAG AAAGGCT TOCCAATCAT CAAGCCCTGA CATGCTGCAG TCCGAAGTTT CC1 1 1 1U CGT 2820 
GGGAGGGAGG CATTCATCTA CAG ACTCCAA CAAGGCCTCC AGCGGAGACA TCTOCCCTTA 2880 
TG ACAACAAC TCCCCAGTGC TGTCTGAGCG CTCCCTGCTG GCTATGCAAG AGOACGCGGC 2940 
CCCGGGGGGC TCGG AGAAGC TTTACAGAGT GCCAGGGCAG TTTATGCTGG TGGGCCACTT 3000 
GTCGTCGTCA AAGTCAAGGG AAAGTTCTCC TGGACCAAGG CTTGGGAAAG ATCTGTCAGA 3060 
GGAGCCTTTC GATATCTGGG GAACTTGGCA TTCAACATTA AAAAGCGGAT CCAAAG ACCC 3120 
AGGAATGACA GGTTCCTCTG GAGACATTTT TGAAAGCAGC TCCCTAAGAG CGGGGCCCTG 3180 
CrCCCTTTCT CAAGGGAACC TGTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAGA 3240 
GCTGG ACAGC GACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCCOCO CGACGGAGGG 3300 
CAGGGCCCAC CCTGCGGTGT CGCGCGCCTG CAGCACGCCC CAOGTCCAGG TGGCAGGGAA 3360 
AGCCGAGCGG CCCACGGCCA GGTCGGAGCA GTACTTGACC CTGAGCGGCG CCCACGACCT 3420 
CAGCGAGAGT GAGCTGGATG TGGCCGGGCT GCAGAGCCGG GCCACACCTC AGTGCCAAAG 3480 
ACCCCATGGG AGTGGGAGGG ATGACAAGCG GCCCCCGCCT CCATACCCGG GCCCAGGGAA 3540 
GCCCGCGGCA GCGGCAGCCT GGATCCAGGG GCCCCCGGAA GGCGTGGAGA CACCCACGGA 3600 
CCAGGGAGGC CAAGCAGCCG AGCGAGAGCA GCAGGTCACG CAGAAAAAAC TGAGCAGCGC 3660 
CAACTCCCTG CCAGCGGGOG AGCAGGACAG TOCGCGCCTG GGGGACGCTG GCTGGCTCGA 3720 
CTGOCAGAG A G AGCGCTGGC AGATCTGGGA GCTCCTGTOG ACCGACAACC CCGATGCCCT 3780 
GCCCGAGACG CTGGTCJjQAG CCCGCACCCA GCCGAGCCCC CCCTGCCCCG AGCCCCCCGC 3840 
CCTCCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAGTOT TCTTCTTTCA 3900 
CACTTCTCAA AAGTG ACACA AGAGAAATCC AGTTCACCTA CAGAGGTAGA GCACTCACGC 3960 
CCCCGCCATT GAGAATAAGG TTCCATTGCG TAGCCAGCCT TAGGAAAAAC AAACAGAACC 4020 
CAAACCAG AT GGCAATGTOC AATCTAAAAA OGTCCCTCTT GGCTCTATAA TATAAG ATAC 4080 
AACI L 'fT G CT TGGTATAGCC TAACOGTATT T A1GTO 1C1 1CG U11 11U AC TATTGTGTAT 4140 
TCTGTAACAG ATTATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200 
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAAXG AG ATATACTAAA CAATG AG ATT 4260 
CTATAGAATG TTCTAGAATG TGCACAAGCG GGTTTCTGTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA CCCTTCCTTC GATAOCAAAC ACTAACAAGA GGAAGCAGAA TATGAGAAGC 4380 
CATATTTTTA CATAGGAGTC AGATACAAAA AG AAAAATCA CTGAATGCTT TTAGATATTG 4440 
AATACGTTTT CAGGAAAATO CTAAATCTGA TAG ATTAOGA AATATATTTT TAGAACTTGT 4500 
TTAGAAAGGA TTCAGTTAAC CAAACAAGAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTOCGTC CCACGTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA ATTTTTATGT ATTTTTCAAA ACTACAAACT GG AATCCAAC TATAAAGTGT 4680 
TTAAGAATCT ACACAG AATA TTCAAATTAT AGAACATGTT TTrTCCCTTT GCCCCATAAT 4740 
CAGTATTTGC CAAATTACAT GCAATTCCTT AAAAACTAAA TCACATTGGT AAAAGGCCTA 4800 
CAGCTTTGTA CTTACATTGT GCCAAAGGCT GAGG AAATGT TTTCTTTCGA ATTTTTATGT 4860 
GTATTGTAAA ATGTTCTACC GTACTTTAGT AGTTTGAAGT TTTCAAGTGC ATAACTATTT 4920 
TTGACCAGCA GAAGGCGATA CGCTTCAGTA TTTTATGCAA TTTTTTTTCA CTTCG AAGGG 4980 
AAAGTGTATT ATAAAAAAAG ATTTTTTTTT TTTAAAACAT GCTACTCTTA ATTTTCATGT 5040 
TGGTGATOAA ATTCCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATG AATAAAA 5100 
AGTT AAGCAA AAAAAAAAAA AAAAAAAAAA AAA 



SEQ ID NO:146 PFG8 Protein sequence:
Protein Accession #: NP_038288.1

1 11 21 31 41 51
| | | | | |

MSAQSLLH5 V FSCSSPASSS AASAKGFSKR KLRQTRSLDP AUGGCGSDE AGAEGS ARGA 60 
TAGRLYSPSL PAESLGPRLA SSSRGPPPRA TRLPPPGPLC SSJFSTPSTPQ EKSPSGSFHF 120 
DYBVPLGRGG LKKSMAWDLP SVLAGPASSR SASSILCSSG GGPNGIFASP RRWLQQRKPQ 180 
SPPDSRGHPY WWKSEGDFT WNSMSGRSVR LRSVP1QSLS ELERARLQEV PFYQLQQDCD 240 
LSCQITIPKD GQKRKKSLRK KLDSUGKEKN KDKEFIPQAF GMPLSQV1AN DRAYKLKQDL 300 
QRDBQKDASD FV ASULPPGN KRQNKELSSS NSSLSSTSET PNBSTSPNTP EPAPRARRRG 360 
AMSVDSITDL DDNQSRLLEA LQLSLPAEAQ SKKEKARDKK LSLNPIYRQV PRLVDSCCQH 420 
LEKHGLQTVG IFRVGSSKKR VRQLREEFDR GIDVSLEEEH SVHDVAALLK EFLRDMPDPL 480 
LTRELYTAH NTLLLEPEEQ LGTLQLLIYL LPPCNCDTLH RLLQFLSIVA RHADDNISKD 540 
GQEVTGNKMT SLNLATTFOP NLLHKQKSSD KEFSVQSSAR AEESTAIIAV VQKMIENYEA 600 
LFMVPPDLQN EVUSLLETD PDWDYLLRR KASQSSSFDM LQSEV5FS VG GRHSSTDSNK 660 
ASSGDISPYD NNSPVLSERS LLAMQEDAAP GGSEKLYRVP GQFMLVGHLS SSKSRESSPG 720 
PRIjGKDLSEE PFDIWGTWHS TLKSGSKDPG MTGSSGDIFE SSSLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHPAVSRACS TPHVQVAGKA ERPTARSEQY 840 
LTLSGAHDLS ESELDVAGUQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKP AAAAAWIQGP 900 
PEGVETPTDQ GGQAAEREQQ VTQKKLSS AN SLPAGEQDSP RLGDAGWLDW QRERWQIWEL 960 
LSTDNPD ALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002202

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CCCCCGAGCC GCGCCGAGTC TGCCGCCGCC GCAGCGCCTC CGCTCCGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTOCTAGAT OOGCGAGGGC GCGGCGCAGC CGAGCAGCGG CTCTTTCAGC 120 
ATTGGCAACC CCAGGGGCCA ATATTTCCCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180 
GGCTGTTCAC CAACTGTACA ACCACCATTT CACTGTGGAC ATTACTCCCT CTTACAG ATA 240 
2SGG AG ACAT GGGAGATCCA CCAAAAAAAA AACGTCTG AT TTCOCTATGT GTTGGTTGCG 300 
GCAATCAGAT TCACGATCAG TATATTCTGA GGGTTTCTCC GGATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGCTTTGTTA 420 
GGGATGGGAA AACCTACTGT AAAAGAG ATT ATATCAGGTT GTAOGGG ATC AAATGCGOCA 480 
AGTGCAGCAT CGGCTTCAGC AAGAAOGACT TCGTGATGCG TGCCCGCTOC AAGGTGTATC 540 
ACATCGAGTO TTTCCGCTGT GTGGCCTGCA GCCGOCAGCT CATCCCTGGG GACGAATTTG 600 
CGCTTCGGG A GGACGGTCTC TTCTGCCG AG CAG ACCACGA TGTGGTGGAG AGGGCCAGTC 660 
TAGGCGCTGG CGACCOGCTC AGTCCCCTGC ATCCAGCGCG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGCCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780 
CCACCCGCGTGCGGACTGTG CTGAACGAGA AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840 
CCGCAAACCC GCGGCCAGAT GOGCTCATGA AGGAGCAACT GGTAGAGATG ACGGGOCTCA 900 
GTCOCCGTGT GATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCG AAGCA 960 
TCATG ATGAA GCAACTCCAG CAGCAGCAGC OCAATGACAA AACTAATATC CAGGGGATGA 1020 
CAGGAACTOC CATGGTGGCT GCCAGTCCAG AGAGACACGA CGGTGGCTTA CAGGCTAACC 1080 
CAGTGGAAGT ACAAAGTTAC CAGCCACCTT GGAAAGTACT GAGCGACTTC GOCTTGCAGA 1140 
GTGACATAGA TCAQCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GOAOCGGGCT 1200 
CTAATTOCAC TGGCAGTGAA GTAGCATCAA TGTCCTCTCA ACTTCCAG AT ACACCTAACA 1260 
GCATGGTAGC CAGTCCTATT GAGGCATGAG GAACATTCAT TCTGTATTTT TTTTCCCTGT 1320 
TGG AGAAAGT GGGAAATTAT AATGTCG AAC TCTGAAACAA AAGTATTTAA CG ACCCAGTC 1380 
AATGAAAACT GAATCAAGAA ATGAATGCTC CATGAAATGC ACG AAGTCTG TTTTAATGAC 1440 
AAGGTGATAT GGTAGCAACA CTGTG AAG AC AATCATGGGA TTTTACTAGA ATTAAACAAC 1500 
AAACAAAAOG CAAAACCCAG TATATGCTAT TCAATGATCT TAGAAGTACT GAAAAAAAAA 1560 
GACGTTTTTA AAACGTAGAG GATTTATATT CAAGG ATCTC AAAGAAAGCA TTTTCATTTC 1620 
ACTGCACATC TAGAGAAAAA CAAAAATAG A AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 
TGCTGTTTCT ATATTGGTCA TTOCCTTGCC AAACAGGAGC TCCAGCAAAA GCGCAGGAAG 1740 
AGAG ACTGGC CTCCTTGGCT GAAAGAGTCC TTTCAGGAAG GTGGAGCTGC ATTGGTTTGA 1800 
TATGTTTAAA GTTGACTTTA ACAAGGGGTT AATTG AAATC CTGGGTCTCT TGGCCTGTCC 1860 
TGTAGCTGGT TT Al ' l ' l ' l ' lT A CITTGCCCCC TCCCCA L Hi 1 1 1 lliAGATC CATCCTTTAT 1920 
CAAGAAGTCT GAAGCGACTA TAAAGGTTTT TGAATTCAG A TTTAAAAAOC AACTTATAAA 1980 
GCATTGCAAC AAGGTTACCT CTATTTTGCC AGAAGCGTCT CGGGATTGTG TTTGACTTGT 2040 
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGTTTT 2100 
CTCTCTCTAT GG AAATAAAA AGG AAAAAAA AAAGGAAACT TTTTTTGTTT GCTCTTGCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGG AAG ACTTGCCACT TTTCATGTCA 2220 
TTTGACA ITI Tl 1U1 l lGCT G AAGTGAAAA AAAAAGATAA AGGTTGTAOG GTG G ' llU 11U 2280 
AATTATATGT CTAATTCTAT GTGTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340 
GCCATATGTA C AATTATATC TTCAGGACT A TTTCACTAAT AAACATTTGG CATAG AT 



SEQ ID NO:148 PFG4 Protein sequence:

Protein Accession #: NP_002193.1

1 11 21 31 41 51
| | | | | |

MGDPPKKKRL BLCVGCGNQ IHDQYILR VS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60 
KTYCKRDYIR LYGIKCAKCS IGFSKNDFVM RARSKVYHE CFRCVACSRQ UPGDEFALR 120 
EDGLFCRADH DWERASLGA GDPLSPLHPA RPLQMAAEPI SARQPALRPH VHKQPEKTTR 180 
VRTVLNEKQL HTLRTCYAAN PRPDALMKEQ LVEMTGLSPR VKVWPQNKR CKDKKRSIMM 240 
KQLQQQQPND KTNIQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFALQSDl 300 
DQPAFQQLVN FSEGGPGSNS TGSEVASMSS QLPDTPNSMY ASPBA 



SEQ ID NO:149 PFG2 DNA SEQUENCE

Nucleic Acid Accession #: NM_001172

Coding sequence: 35-1103 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCGGAGCTCT QCCTTGGAQA TTCTCAGTGC TGCCH3 ATCATJJITCCTAAGG GGCAGOCTCT 60 
CGCGTCTCCT CCAG ACGCGA GTGCATTCCA TCCTG AAG AA ATCCGTCCAC TCCGTGGCTG 120 
TGATAGG AGC OOOGTTCTCA CAAGGGCAGA AAAGAAAAGG AGTGGAGCAT GGTCCCGCTG 180 
CCATAAGAGA AGCTGGCTTG ATG AAAAGGC TCTCCAGTTT GGGCTGCCAC CTAAAAGACT 240 
TTGGAGATTT GAGTTTTACT CCAGTCCCCA AAGATGATCT CTACAACAAC CTGATAGTGA 300 
ATCCACGCTC AGTGGGTCTT GCCAACCAGG AACTGGCTGA GGTGGTTAGC AGAGCTOTGT 360 
CAGATGGCTA CAGCTGTGTC ACACTGGGAG GAG ACCACAG CCTGGCAATC GGTAOCATTA 420 
GTGGCCATGC CCGACACTGC CCAGACCTTr GTGTTGTCTO GGTTQATGCC CATGCTGACA 480 
TCAACACACC CCTTAGCACT TCATCAGGAA ATCTCCATGG ACAGOCAGTT TCATTTCTOC 540 
TCAGAGAACT ACAGG ATAAG GTACCACAAC TCCCAGGATT TICCTGGATC AAACCTTGTA 600 
TCTCTTCTGC AAGTATTGTG TATATTGGTC TG AG AGACGT GGACCCTGCT GAACATTTTA 660 
TTTTAAAGAA CTATGATATC CAGTATTTTT CCATGAGAGA TATTGATCGA CTTGGTATCC 720 
AGAAGGTCAT GGAACGAACA TTTGATCTGC TGATTGGCAA GAGACAAAGA OCAATOCATT 780 
TGAGTn I G A TATTGATGCA TTTGACCCTA CACTGGCTOC AGCCACAGGA ACTCCTGTTG 840 
TCGGGGGACT AACCTATCG A G AAGGCATGT ATATTGCTGA GGAAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGGATCTT GTTGAAGTCA ATCCTCAGTT GGCCACCTCA GAGGAAGAGG 960 
CGAAGACTAC AGCT AACCTG GCAGTAGATG TG ATTGCTTC AAGCTTTGGT CAGACAAG AG 1020 
AAGG AGGGCA TATTGTCTAT GACCAACTTC CTACTOCCAG TTCACCAGAT GAATCAGAAA 1080 
ATCAAGCACG TGTG AGAATT JAGGAGACAC TGTGCACTGA CATGTTTCAC AACAGGCATT 1140 
GCAGAATTAT GAGGCATTGA GGGGATAGAT GAATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GOCTTAATGA G AACATTTAC ACATTCTCAC AATTGTAAAG TXTOCCCTCT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGIUT1 TGCAGTTCAC AGGGTATTAA TATGCTACAG 1320 
TACTATGTAA ATTTAAAG AA GTCATAAACA GCATTTATTA OCTTGGTATA TCATACTGGT 1380 
CTTGTTGCTG TTGTTCCTTC ACATTTAAGT GGTTTTTCAT CTTTOCTCOC TCCTCOCACA 1440 
GOCTGGCTAT ACAGTGCATC CTTGAACTGT CAGCCCACAG CAGCAATATG CTTATTCTAT 1500 
CCACATCCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTOCACAAA COCTTCCCTA 1560 
TAGAAGTTCA ATGGCTGCGA AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCGAGCT 1620 
CCAGTAAGAT GATAATGGAA AGCAGCAGCT TGTTGGTTGT CACTCTACAA AG AGAAGCAA 1680 
AGTGGGGAGT AGTCAGAAGT TTGGATAAOC TTCCTTCTAA ACATTTGGGG GTTAGAOCTG 1740 
GGAGCAOGGC TGG ATACTCT GAGGCTGTAT GTTTGATCAC ACAGCCACTT AGCAGGAAGT 1800 
ACTCATAAGG TTCTTTAGCT GTCACTTAGG GATAACACTG TCTAOCTCAC AGAAATGTTA 1860 
AACTG AG ACA ATAAAACCCA AAGCAT 



SEQ ID NO:150 PFG2 Protein sequence:
Protein Accession #: NP_001163.1

1 11 21 31 41 51
| | | | | |

MSLRGSLSRL LQTRVHSILK KSVHSVAVIG APFSQGQKRK GVBHGPAAIR EAGLMKRLSS 60 
LGCHLKDPGD LSFTPVPKDD LYNNUYNPR S VGLANQELA EWSRAVSDG YSCVTLGGDH 120 
SLAIGTBGH ARHCPDLCW WVD AHADINT PLTTSSGNLH GQPVSFLLRE LQDKVPQLPG 180 
FSWKPOSS ASIVYIGLRD VDPPEHFILK NYDIQYFSMR DIDRLGIQKV MERTFDLLIG 240 
KRQRPIHLSF DID AFDPTLA PATGTP WGG LTYREGMYIA EEtHNTGLLS ALDLVEVNPQ 300 
LATSEEEAKT TANLA VD VIA SSFGQTREGG HTVYDQLPTP SSPDESENQA RVRI 



SEQ ID NO:151 PFG1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017906

Coding sequence: 60-1255 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

AATTATATAT TTTTACTCTA IU1 1 1C1U A CA 1U1 11 11T ICi 1 1CC GTTGCTGGCGGAA 60 
GAGGCAOQTG OGCTGCTG AA TGO AGCTOGT CGCTGGTTGC TAOGAGCAGG TOCTCTTTOO 120 
GTTCGCTGTA CACOGGGAGC CCAAGGCTTG CGGCX3 ACCAC GAGCAATGGA CTCTTGTGGC 180 
TGACTTCACT GAGCATGCTC ACACTGCCTC CTTGTCAGCA GTAGCTGTAA ATAGTOGTTT 240 
TGTGGTCACT GGGAGCAAAG ATGAAACAAT TCACATTTAT GACATG AAAA AGAAGATTG A 300 
GCATGGGGCT CTAGTGCATC ACAGTGGTAC AATAACTTGC CTGAAATTCT ATGGCAACAG 360 
GCATTTAATC AGTGGAGCGG AAGATGGACT CATCTGTATC TGGGATGCAA AGAAATGGG A 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGACC TTOCTTTCTA TTCACCCATC 480 
TGGCAAGTTG GOCCTGTCGG TTGGTACAGA TAAAACTTTA AG AACGTGGA ATCTTGTAGA 540 
AGGAAGATCA GCATTCATAA AAAATATAAA ACAAAATGCT CACATAGTAG AATGGTCCCC 600 
AAGAGGAGAG CAGTATGTAG TTATCATACA GAATAAAATA GACATCTATC AGCTTGACAC 660 
TGCATOCATT AGTGGCACCA TCACAAATGA AAAGAGAATT TCCTCTGTTA AATTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGG AGATGA AGAAGTTATA AGGTTTTTTG ACTGTOATTC 780 
ACTAGTGTGC CTCTGCGAAT TTAAAGCTCA TGAAAACAGG GTAAAGG ACA TGTrCAGTTT 840 
TG AAATTCCA G AOCATCATG TTATTGTTTC AGCATDGAGT GATGGTTTCA TCAAAATGTG 900 
GAAGCTTAAG CAGGATAAGA AAGTTOOOOC ATCTTT ACTC TGTGAAATAA ACACTAATGC 960 
CAGGCTGACG TGTCTTGGAG TGTGGCT AG A CAAAGTGGCA GACATGAAAA GCCTTOCTCC 1020 
AGCTGCAG AG CXriTCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080 
TGACACAGTG CACAAAGAAG AAAAGCGGTC AAAAOCTAAC ACAAAG AAAC GCOOTTTAAC 1140 
AGGTG ACAGT AAGAAAGCAA CAAAAGAAAC TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1200 
GGTAGAAATG TTGGAAAAGA AGAGG AAAAA GAAGAAAATA AAAACAATGC AGTGAATCAC 1260 
AGATCTCT CC TGAAAGAACT CTTTTAGATG AAATCATTCT ACTCAAATGT ACCTTAATTT 1320 
11111111CC CTGAGTAAAA GCAAGAAATT TCTTCCTTTG GAAAAAATAT ATATATTAAA 1380 
AAACCACTTT TAGATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTACTTTTGG 1440 
CAGACAGTGT TTTATGAATT ATGTATCATG TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTTTTACTT TGTACAAAGC AAATAAAG AT CTTTCTCAAA AAAAAAAAAA AAAA 



SEQ ID NO:152 PFG1 Protein sequence:
Protein Accession #: NP_060376.1

1 11 21 31 41 51
| | | | | |

MELVAGCYEQ VLFCFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFWTGSK 60 
DBTIHIYDMK KKIEHGALVH HSGTTTCLKF YGNRHLISGA EDGUCIWDA KKWECLKSIK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAH KNIKQNAHTV EWSPRGEQYV 180 
VDQNKIDIY QLDTAS ISGT 1TNEKRISSV KJFLSESVLA V AGDEEV1RFF DCDSLVCLCE 240 
FKAHENRVKD MFSFHPEHH VIVSASSDGF DCMWKLKQDK KVPPSLLCH NTNARLTOjG 300 
VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPGDTVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGUSTK KRKMVEMLEK KRKKKKHCTM Q 



SEQ ID NO:153 PFD6 DNA SEQUENCE

Nucleic Acid Accession #: NM_014668

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GATGTCTTGG ACATGCTCTG GCTGGCTAAT CTOCATGTTC TAGCCGACTG AAAATAOGGT 60 
GGCCAAGTGG ATGGTGTGCT TATTTGCAGT CTAAAGAAAT TTCCTTTT G A TGT GGCAGAA 120 
AATCGAGGAT GTGGAGTGGA GACCOCAGAC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180 
CCTGATCTTC AGTGCGATGG ACCCGCATGG GGAGTCCTTG COGAG G TCIT TGAGGTACTG 240 
TGACCTGCGA TTGATAAACT CCTCCTGCTT GGTGAGAACA GCCTTGGAGC AGG AGCTGGG 300 
CCTGGCTGCC TACTTTGTG A GCAACGAGGT TCOCTTGGAG AAGGGGGCTA GG AAOG AGGC 360 
CTTOGAGAGT GATGCTGAGA AGCTGAGCAG CACAGACAAC GAGG ATGAGG AGCTGGGGAC 420 
AGAAGGCTCT ACCTCGGAG A AGAGAAGCOC CATGAAAAGG GAGAGGTCCC GCTCCCACG A 480 
CTCAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GOGCTCGGTG GCGAGTCCTC 540 
GGCTCAGCOC ACAGCACTCC OCCAGGGAGA GCATGCCAGG TGGOCOCAGC COCGTGGOCC 600 
CGCAGAGGAG GGCAGAGCOC CTGGTGAGAA ACAG AGGCCC CGGGCAAGTC AGGGGGCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGOCCCAG CCCGACTGTA GCCTCAGGAC 720 
CGGOCAG AGG AGCGTCCAGG TGTOGGTCAC CTCGTCGTGC TCCCAGCTGT CCTCCTCCTC 780 
GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840 
GTGCTOCTTG AOCAAGGCCT GCOGCCAGCC ACCCATTGTC TTCTTGOCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG GCTGGOCAAG GCOGGCTOOC TGCTGOCCIC 960 
CCCCTCGGTC ATGTGGGOCA GCTCTTTCCG CTCCCTGCTC AGCAAGACXA TGACATCCAC 1020 
CGAGCAGTOC CTCTACTACC GGCAGTGGAC GGTGOCCCGG CCCAGCCACA TGGACTACGG 1080 
CAAOCGGGGC GAGGGOCOCG TGGAOGGCTT CCACOCOCXjC AGGCTGCTOC TCAGCGGOOC 1140 
CCCTCAGATC GGGAAGACAG GTGCCTAOCT GCAGTTCCTC AGTGTOCTGT CCAGGATGCT 1200 
TGTTCGGCTC ACAGAAGTGG ATGTCTATGA CGAGGAGGAG ATCAATATCA ACCTCAG AGA 1260 
AGAATCTGAC TCGCATTATC TCCAGCTTAG CGACCCCTGG (XAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCOC TTTGACTACA TCATTCACGA CCCGAAGTAT GAAG ATGCCA GCCTGATTTG 1380 
TTCGCACTAT CAGGGTATAA AG AGTG AAGA CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440 
TTATGTGCGG CGTCAGACGG CACGG ATG AG ACTGTCCAAG TACGCAGOGT ACAACACTTA 1500 
OCAOCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTOCAC CCOCGCTACC AGCTGTATGA 1560 
GTCCACOCTO CAOGOCTTTG CC 1 1C 1 C1 1 A CTCCATGCTA GG AG AGGAGA TCCAGCTGCA 1620 
CTTCATCATC OCCAAGTOCA AGGAGCACCA CI 1 A G 1 C 1 1C AGOCAAOCTG GAGGOCAGCT 1680 
GGAG AGCATG CGACTACCOC TOGTGACAGA CAAGAGCXAT GAATATATAA AAAGTCCGAC 1740 
ATTCACTCCA AOCACXXKSOC GTCAOG AACA TGGGCTCTTT AATCTGTACC ACX3CAATGGA 1800 
CGGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGGAATAC GAGATGGCAA TTTATAAGAA 1860 
ATATTGGOGC AAOCACATCA TGCTGGTGCT COOCAGTATC TTCAACAGTG CTGG AGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGG AGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACCG 1980 
GCAGG AGGAG CTGGGAATCA AGCCGCAGG A CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040 
CTCCTGOGTG ATGTGGAACG TGGTGGATGT CAACTCTGCT GGGGAG AGAA GCAGGGAGTT 2100 
CTOCTGGTOG GAAAGGAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTACGCTC TGCTGGGCCT GOGG AAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA GAGCOCTTCT OCOGCTGCCA CGTGCACAAC TTCATCATOC TGAACGTGG A 2280 
CCTGAOCCAG AACGTGCAGT ACAACCAG AA CCGGTTCCTG TGTGAOG ATG TAG ACTTCAA 2340 
CCTGCGGGTG CACAGCGCCG GCCTCCTGCT CTGCCGGTTC AACOOCTTCA GCGTGATGAA 2400 
GAAGCAG ATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGG TGTCTGATAA 2460 
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
CGCAGCGOOC GCCCAGCTCC TGCTGGAGAA GTTCCTGCAG CACCACAGCC AOCICTILTI" 2580 
GGCGCTGTGC CTGAAGAACC ATG ACCACCC AGTGCTGTCT GTCGACTGTT ACCTG AACCT 2640 
GGGATCTCAG ATTTCTGTTT GCTATGTGAG CTOCAGGCCC CACTCTTTAA ACATCAGCTG 2700 
CTOGGACTTG CTGTTCAGTG GGCTGCTGCT GTACCTCTGT GACTCTTTTG TGGGAGCTAG 2760 
CTTTTTG AAA AAGTTTCATT TTCTGAAAGG TGOGACGTTG TGTGTCATCT GTCAGGACCG 2820 
GAGCTCACTG OGCCAGACGG TOGT00GOCT GGAGCTOGAG GAOG AGTGGC AGTTCGGGCT 2880 
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GCGCGATC AG TTCCAGACCG CCAATGCCAG GGAAGACCGG CCGCTCTTTT TTCTGACGGG 2940 
ACGACACATC JjQAGGAAGAC AGOGGCGAGT TTTCTG AAGA GATGAGTGCT CAG AGCCCTC 3000 
ATGCTGTTGA GGCTAAAGGG AGGOCTGGAA OGGTGGGGCG TTTGACTGGA ATGG ACCCCA 3060 
GGG ACTGTCC AGGTGCAGOC CCTCCTAGTA CACATGGGCC OCOGAGGCCG TGGTCCTGGG 3120 
AGOCAGGAAG ACTCCGCAGT GGGTG AGAAT GAAAACTTGA GACTCOCAAG TTCTGGOCCA 3180 
GCCCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCACGAGG AACAAAGATT TACTTCCTGT 3240 
CCTGOCATTC GTGTGCTTCC ATGGACAAAC CTO ATTTTTT TCTCTTAGTT CTAAAGAATC 3300 
TTGGGTTATT TTGTAGCGGT GOCAGTATTT CAGTAGATGG G ATTTCAGCC AAGTAGGTTC 3360 
CCCTGTAAOC TCCTACAAAG CAATATTCCA AAGGAACATT TTAACTGTAA AGGCTGGAGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AOTTTGCTGT 3480 
TTCAGAGGCT AGCOCAAAGG CATCAAATTT AATAAAGTTA AACAAATTGA TTTACTTCAG 3540 
AGCAAATATG ATCCTATTAA AATAATATAG GGTAAATACC CTAOCTCTTA GAAAGGGCAA 3600 
AAATGCAAAG AAGCTTTCTT TAAAACTAAA AGGGTTTTTT GGGGGGGGAG TTGGOGGGGA 3660 
GGAAATAAGG CTAACAG AGO TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGAOCACAT 3720 
TGCTTACTTO AAACAGACAA TGAAAACAAC CAAAGTGATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTCATTTT 3840 
CTCTTACTAA TCTTTGGTTC CTCAGGGAAG AATCTCACTT GACTAG AGAG GAGGTGGGAA 3900 
CAGAAGAGAG AAGGAGGCAG GG A G ATGTAT TTCTTAG GGC TCAOCOCTTC ACAGACTGAC 3960 
AGAATGGTTT TGTTTTGTTT TGTTTTGTTT TGTTTTGTTT TTG AG ATGGA CTCTAGCTCT 4020 
GTCACOCAGG CTGG AGTGCA GTGGTGOGAT CTCGGCTCAC TGCAAGCTCC GCCTCCOGGG 4080 
TTCTCACCAT TCTCCTGCCT CAGCCtCCCG AGTAGCTGGG ACTACAGGOG OCCAOCACCA 4140 
CGOCOGGCTA ATTTTTTGTA TTTTTTAGTA G AGAOGGGGT TTCACCATGT TAGCCAGGAT 4200 
GGTCTOG ATC TOCTGACCTC GTGATCCGCC CGCCTOGGCC TOOCAAAGTG CTGGOATTAC 4260 
AGGCGTGAGC CACCGTGCCT GCCOCAGAAT GGTTTTTAAA GCCACAGTTG AGAGGCCACC 4320 
CATTGCCCGG OGGCTGGACA GTGATCATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTO 4380 
ATTGGAATTA TTCATOOOCT TTGAAAG ATG AGAAGGTTGA GATGCAAAGA GTCTACCTTT 4440 
CCAAGTTCTC ACTGCTGGAA AGAGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGGACTC 4500 
TGCAGTCCAG GTCTCCCTTC TCOCACTTGC CTACCCTCAA TGOCACACTG TTTTTGAAGT 4560 
GGCOCATAAC TTG AAGGAAA AGTTTAAAGA CAGTTCAATT TAATCATCAG AATGCATTCT 4620 
TTITTTTTTC GG AGACGG AG TTTCACTCTT GCTGOCCAGG CTGGAGTGCA ATGGTGCAAT 4680 
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGGTTCAAGT GATTCTGCAG OCTCAGCCTC 4740 
CCGAGTAGCT GGGATTATGG GCGOCCACCA CCATGOCCAG CTAATTTTTG TATTTTTTTT 4800 
TTTTAGTAGA GATGGGGTTT CGOCAGGTTG GCCAGGCTGG TCTTGTOAAC TCCTGGCCTC 4860 
AGGTGATCTG CCCAOCTCAT CCTCCAAAAG TGCTGGGATT ACAGGCATG A GOCACTGCGC 4920 
CTGGCCTCAG AATGCATTCT TACACATCTA TCCTAGACAT TTATAAGCAC TCTAATGGAT 4980 
AACAATCCAA GAATAAATGA TTGTAAAAGA TGATGCCGAA GAGTTGATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTOCG CGAGTATTAA ATATTTAGAT CAATGTTTAT AAAATGATTA 5100 
CTTTOTATAT CTCATTATTC CTATTTTGGA ATAAAAACTG ACCTTCTTTA ATCATATACT 5160 
TO TCTTT TOT AAATAGCAGC TTTTGTGTCA TTCTOCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence

Protein Accession #: NP_055483.1

1 11 21 31 41 51 
I I I I I I 

MWQKEDVEW RPQTYLELEG LPCttJFSGM DPHGESLPRS LRYCDLRUN SSCLVRTALE 60 
QELGLAAYFV SNHVPLEKGA RKEALESDAE KLSSTDNEDE ELGTEGSTSE KRSFMKRERS 120 
RSHDSASSSL SSKASGSALG GESSAQPTAL PQGEHARSPQ PRGPAEEGRA PGEKQRPRAS 180 
QGPPSAISRH SPGPTPQPDC SLRTGQR5VQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASQCSLTKA CRQPPIVFLP KLVYDMWST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300 
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDGFHPRRLL LSGPPQIGKT GAYlJQFLS VL 360 
SRMLVRLTEV DVYDEEEINI NLREESDWHY LQLSDPWPDL ELFKKLPFDY IIHDPKYEDA 420 
SliCSHYQGI KSEDRGMSRK PEDLYVRRQT ARM RLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AESYSMLGEB IQLHFHPKS KEHHFVFSQP GGQLESMRLP LVTDKSHEYI 540 
KSPTFTPTTG RHEHGLFNLY HAMOGASHLH VLWKEYEMA IYKKYWPNHI MLVLPSIFNS 600 
AGVGAAHFLI KELSYHNLEL ERNRQEELGI KPQDIWPFTV ISDDSCVMWN WDVNS ACER 660 
SREFSWSERN VSLKHIMQHI EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFH 720 
LNVDLTQNVQ YNQNRFLCDD VDFNLRVHSA GLLLCRFNRF SVMKKQIWG GHRSFHTTSK 780 
VSDNSAAWP AQYICAPDSK HTFLAAPAQL LLEKFLQHHS HLFFPLSLKN HDHPVLSVDC 840 
YLNLGSQISV CYVSSRPHSLNISCSDLLFS GLLLYLCDSF VGASFLKKFH FLKGATLCVI 900 
CQDRSSLRQT WRLELEDEW QFRLRDEFQT AN AREDRPLF FLTGRJfl 

SEQ ID NO:155 PFC8 DNA SEQUENCE

Nucleic Acid Accession #: NM_000522

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51 

^n^CAGOCT (XGTGCTCCT CCACCCCCGC TGG ATCGAGC CCACCGTCAT GTTTCTCTAC 60 
GACAACGGCG GCGGCCTCGT GGCCGACGAG CICAACAAGA ACATGGAAGG GGCGGOGGCQ 120 
GCTGCAGCAG CGGCTGCAGC GGCGGCGGCT GCCGGGGCCG GGGGCGGGGG CTTCCCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TDGGTGGCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACCAGTG CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCCAGG A 300 
GCCGCGTCCG CCTACAGCAG CGCCCCCGGG GAGGCGCCCC CGTCGGCTGC CGCCGCTGCT 360 
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GCCGCGGCTG CCGCTGCAGC CGCCGCCGCC GCCGCCCCGT COTCCTCGGG AGGTCCCGGC 420 
C0GGOGGGOC OGGOGGCGGC AGAGGOOOOC AAGCAATGCA OGCCCTGCTC GGCAGOGGOG 480 
CAG AGCTOOT CGGGGCOCGC GGOGCTGOOC TATGGCTACT TOGOCAGCGG CTACTACCCG 540 
TGCGCCCGCATXKKjCCCXKXICCCCAACGCX: ATCAACrrCCTGCOCCCAGCCCCCCICGGCC 600 
GCCGCCGCCG OCGOCTTOGC GG ACAAGTAC ATGGATAOCG OCGGCOCAGC TGCCG AGGAG 660 
TTCAGCTCCC GCGCTAAGG A GTTCGCGTTC TACCACCAGG GCTAOGCAGC OGGGOCTTAC 720 
CACCAOCATC AGCCCATGCC TGGCTACCTG GATATGCCAG TGGTGCCGGG CCTCGGGGGC 780 
COCGGGGAGT GGOGOCACGA ACCCTTGGGT CTTCCCATGG AAAGCTACCA GCOCTGGGCG 840 
CTGCCCAACG GCTGGAACGG CCAAATGTACTGCCCCAAAG AGCAGGCGCA GCCTCCCCAC 900 
CTCTCGAAGT OCACTCTGOC GG ACGTGGTC TCCCATCCCT OGG ATGCCAG CTOCTATAGG 960 
AGGGGGAGAA AGAAGOGCGT GCCTTATAOC AAGGTGCAAT TAAAAGAACT TG AACGGGAA 1020 
TAOGCCACG A ATAAATTCAT TACTAAGGAC AAACGGAGGC GGATATCAGC CACG ACGAAT 1080 
CTCTCTGAGC GGCAGGTCAC AATCTGGTTC CAGAACAGG A GGGTT AAAGA G AAAAAAGTC 1140 
ATCAACAAAC TGAAAACCAC TAGTTAA 

SEQ ID NO:156 PFC8 Protein sequence



1 11 21 31 41 51 
I I I I I I 

MTASVLLHPR WIEPTVMFLY DNGGGLVADE LNKNMEGAAA AAAAAAAAAA AGAGGGGFPH 60 
PAAAAAGGNF SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSSAPG EAPPS AAAAA 120 
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYK5SGYYP 180 
CARMGPPPNA IKSCPQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240 
HHHQPMPO YL DMPWPGLGG PGESRHEPLG LPMES YQPWA LPNGWNGQMY CPKEQAQPPH 300 
LWKSTLPDW SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFTTKD KRRRIS ATTN 360 
LSERQVTIWF QNRR VKEKKV INKLKTTS 



SEQ ID NO:157 PFA3 DNA SEQUENCE

Nucleic Acid Accession #: AW102723

Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
I I I I I I 

COCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCGCCTGCC CTAGTCTG AG 60 
CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 
TTCCTACACT ITTCCTGCGC TAGAGCAGOG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 
ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGOGGGGCG TG ATCTCACC 240 
ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TOOGG AAGCA CAGCCOCGAG 300 
GTGTGOGAAG CCACCAAG AC TGCGGCTCTT GGAG AAAGOG TGAGCAGGGG GCCACCGOGG 360 
TCTCCGGCCT GTCTGCACCC TGTCGCCTG A GCTGOCTG AC AGTG ACAATG ACATCCCAGT 420 
TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA G AACTACAGC 480 
TCATCAGG AG GAGATCGCAG CAGGGTAAGA G ACACCAACA CCATjffTTCTG CACG AAGCTC 540 
AAGG ATCTCA AG ATCACAGG AG AGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 
AACGAGTCTT CAGAGG AGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 
TGTCAAGACA TTCCTGAG AA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 
AGCCGAGTCT ATCTTCACAC TTTGGCAG AG AGTATTTGCA AACTG ATTTT CCCAGAGTTT 780 
GAAOGGCTGA ATGTTGCACT TCAGAG AACA TTGGCAAAGC ACAAAATAAA AG AAAGCAGG 840 
AAATCTTTGG AAAGAG AAGA CTTTG AAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 
CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGG AAG 960 
ATG AAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 
CCTTCTG AAA CAGAGCAGCC ATTGCCAAG A AGCAGGAAAA AGGGGCAGCT TG AGG ACGCC 1080 
TCCATTCTAT GCCTGG ATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 
AG AACCACCT CCCTG ATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATG AA 1200 
ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAOTTTGTG 1260 
AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATG AAAAGCA CCAAGOCATC CCTGTCCCCC 1320 
AGCAAACCCC AGTCCTCGCT GGTG ATTCCC ACATCGCT AT TCTGCAAGAC ATTTCCATTC 1380 
CATTTCATGT TTG ACAAAG A TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 
ATG AACAGGA GAGACTTTCA AGG AAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 
AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 
GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620 
ATG ATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGG ACAGA 1680 
TTAG AAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 
AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 
GGGAAGCTGA AGGCTACCCT TG AGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 
ACAGTAGACC TTCTGTGCTC CAT ATTTCCC TGTG AGGTTG CTCAGCAGCT GTGGCAAGGG 1920 
CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 
TTCACTGCCA TCTGCTCCCA GTGCTCACCO CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 
TACACTCGCT TCGACGAGCA GTGTGGAG AG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 
ATGOCTATTG TCJTGGCTTGG GGGATTACAC AAAGAG AGTG ATACTCATGC TGTTCAG ATA 2160 
GCGCTG ATGG CCCTGAAGAT GATGG AGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 
OCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT OGTTGGAGTT 2280 
AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 
TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 
CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGG AACTTC CACCAAACTT CCCTAGTGAA 2460 
ATCOCCGGAA TCTGCCATTT TCTGG ATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 
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TTCCAAAAGA AAG ATGTGGA AG ATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAQA 2580 
TTAGCAACCT ATA TACCT AT TTATAAGTCT TTOOO GTTTQ ACTCATTGAA GATGTGTAGA 2640 
GOCTCTGAAA GCACTTTAGG GATTGTAGAT GG CTAA CAAO CAOTATTAAA ATTTCAGGAG 2700 
OCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 
AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AG ATAATTGT 2880 
AGTCAATTGT ACAAACTG AT GGAGTCACCT GCAATCTCAT ATOCTGGTGG AATGCCATGG 2940 
TTATTAAAGT GTGTTTGTGA TAGTTOTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:158 PFA3 Protein sequence
Protein Accession #: NP_000647.1

1 11 21 31 41 51 
I I I I I I 

MPCTKLKDUC ITGECPFSLL APGQVFNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKES RKSLE REDFEKTIAE 120 
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLBAPLKI FKQLQYPSBT EQPLPRSRKK 180 
GQLEDASILC LDKEDDFLHV YYFFPKRTTS ULPGHKAA AHVLYETEVE VSLMPPCFHN 240 
DCSEFVNQPY LLYSVHMKST KPSLSPSKPQ SSLVIPTSLF CKTFPFHFMF DKDMTILQFG 300 
NGJRRLMNRR DFQGKFNFEY FEXLTPKINQ TFSGIMTMLN MQFWRVRRW DN5VKKSSRV 360 
MDLKGQMXYI VESSAILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDW LIGEQARAQD 420 
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCSIFPCEVA QQLWQGQWQ AKKFSNVTML 480 
FSDXVGFTAI CSQCSPLQVI TMLNALYTRF DQQOGELDVY KVETIAMHV WLGGLHKESD 540 
THAVQIALMA LKMMELSDEV MSPHGEP1KM RIGLHSGSVF AGWGVKMPR YCLFGNNVTL 600 
ANKFESCSVP RKINVSPTTY RLLKDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660 
TNSKPCPQKK DVEDASQFFR QSERNRLATY IPIYKSLGFD SLKMCRASES TLGIVDG 

SEQ ID NO:159 PFA1 DNA SEQUENCE

Nucleic Acid Accession #: NM_004362

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

OGOOGGOOGG ACTGGTCTGA AGAGACGCGG GGACAAAGTG GCAACGACTT GGACATCTGA 60 
GCTGTCACTG COOAAAACAG GCCGCAAGAO AGATAATCAA TATGCATTTC CAAGCCTTTT 120 
GGCTATGTTT GCKjTCTTCTG TTCATCTCAA TTAATGCAGA ATTTATGGAT GATGATGTTG 180 
AGACGGAAG A CTTTGAAGAA AATTCAGAAG AAATTGATGT TAATGAAAGT GAACTTTCCT 240 
CAGAGATTAA ATATAAG ACA CCTCAACCTA TAGG AGAAGT ATATTTTGCA GAAACTTTTG 300 
ATAGTGGAAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA GAAAGATG AC ATGGATGAGG 360 
AAATTTCAAT ATACGATGGA AG ATGGG AAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420 
GTGACAGAGG ACTGGTATTA AAATCTAGAG CAAAGCATCA TGCAATATCT GCTGTATTAG 480 
CAAAACCATT CATTTTTGCT G ATAAACCCT TGATAGTTCA ATATGAAGTA AATTTTCAAG 540 
ATGGTATTGA TTGTGGAGGT GCATACATTA AACTCCTAGC AGACACTGAT GATTTG ATTC 600 
TGG AAAACTT TTATGATAAA ACATCCTATA TCATTATGTT TGGACCAG AT AAATGTGGAG 660 
AAGATTATAA ACTTCATTTT ATCTTCAG AC ATAAACATOC CAAAACTGGA GTTTTCGAAG 720 
AGAAACATGC CAAACCTCCA GATOTAG ACC TTAAAAAGTT CTTTACAGAC AGGAAGACTC 780 
ATCTTTATAC CCTTGTGATG AATCCAGATG ACACATTTG A GGTGTTAGTT GATCAAACAG 840 
TTGTAAACAA AGGAAGCCTC CT AGAGG ATG TGGTTOCTCC TATCAAACCT CCCAAAG AAA 900 
TTGAAGATCC CAATG ATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTCCTGATC 960 
CTTCTGCCGT CAAAGCAGAA G ACTGGG ATG AAAGTG AACC TGCCCAAATA GAAGATTCAA 1020 
GTGTTGTTAA ACCTGCTGGC TGGCTTGATG ATGAACCAAA ATTTATCCCT G ATCCTAATG 1080 
CTGAAAAACC TGATGACTGG AATGAAG ACA CGGATGGAGA ATGGGAGGCA CCTCAGATTC 1140 
TTAATCCAGC ATGTCGGATT GGGTGTGGTG AGTGG AAACC TCCCATG ATA G ATAACCCAA 1200 
AATACAAAGG AGTATGGAGA CCTCCACTGG TCGATAATCC TAACTATCAG GGAATCTGGA 1260 
GTCCTCGAAA AATTCCTAAT CCAGATTATT TCGAAGATGA TCATCCATTT CTTCTGACTT 1320 
CTTTCAGTGC TCTTGGTTTA G AGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380 
TTATTATCTG TTCGGAAAAG GAAGTAGCAG ATCACTGGGC TGCAGATGGT TGGAGATGGA 1440 
AAATAATGAT AGCAAATGCT AATAAGCCTG GTOTATTAAA ACAGTTAATG GCAGCTGCTG 1500 
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTG ACAGC AGGAGTGCCA ATAGCATTAA 1560 
TTACTTCATT TTGTTGGOCA AG AAAAGTAA AGAAAAAACA TAAAGATACA G AGTATAAAA 1620 
AAACCG ACAT ATGTATACCA CAAACAAAAG G AGTACTAGA GCAAGAAGAA AAGG AAGAGA 1680 
AAGCAGCCCT GGAAAAACCA ATGGACCTGG AAGAGGAAAA AAAGCAAAAT GATGGTGAAA 1740 
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800 
TAGAAGGGCA AGAAGAAAGT AATCAATCAA ATAAGTCTGG GTCAGAGG AT GAGATGAAAG 1860 
AAGCAGATG A GAGCACAGGA TCTGGAG ATG GGCCGATAAA GTCAGTAOGC AAAAG AAGAG 1920 
TACGAAAGGA CJA&ACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980 
AAAAATCAGC ATGCCAGACC TG AACTTTAA TCAGTCTGCA CATCCTGTTT CTAATATCTA 2040 
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCGAGGA AAAAGAAGCA 2100 
ACTTTGAAGT TACCTCATCT TTG AATTTAG AATAAAAGTG GCACATTACA TATCGGATCT 2160 
AAG AG ATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGG AGATAG TTTTGGTTTG 2220 
TACAG AACAA AATAATATGT AGCAGCTTCA TTGCTATTGG AAAAATCAGT TATTGG AATT 2280 
TCCACTTAAA TGGCT AT ACA ACAATATAAC TGGT AGTTCT ATAATAAAAA TGAGCATATG 2340 
TTCTGTTGTG AAG AGCTAAA TGCAATAAAG TTTCTGTATG GTTGTTTGAT TCTATCAACA 2400 
ATTG AAAGTG TTGTATATQA COCACATTTA CCTAGTTTGT GTCAAATTAT AGTTACAGTG 2460 
AGTTGTTTGC TTAAATTAT A GATTOCTTTA AGGACATGCC TTGTTCATAA AATCACTGG A 2520 
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TTATATTCCA GCATATTTTA C ATTTOA ATA CAAOGATAAT GGGTTTTATC AAAACAAAAT 2580 
GATGTACAGA TTTTTTTTCA AGTTTTTATA GTTOCTTTAT GOCAGAGTGG TTTACCOCAT 2640 
TCACAAAATT TCTTATGCAT ACATTGCTAT TGAAAATAAA ATTTAAATAT TTTTTCATCC 2700 
TGAAAAAAAA 

SEQ ID NO:160 PFA1 Protein sequence
Protein Accession #: NP_004353.1

1 11 21 31 41 51
I I ! I I I 

MH FQ AFW LCL GLLHS1NAE FMPPDVETED FEENSEHD V NE5ELSSBK YKTPQPIGHV 60 
YFAETFDSGR LAGWVLSKAK KDDMDEEISI YDGRWEEEL KENQVPGDRG LVLKSRAKHH 120 
1C A1SAVLAKPF IFADKPUVQ YEVNFQDGJD CGGAYIKLLA DTDDULENF YDKTSYPMF 180 
GPDKOGEDYK LHFIFRHKHP KTGVFEEKHA KPPDVDLKKF FTDRKTHLYT LVMNPDDTFB 240 
VLVDQTWNK GSLLEDWPP IKPPKEIEDP NDKKPEEWDE RAKIPDPS AV KPEDWDESEP 300 
AQD5DSSWK PAG WLDDEPK FIPDPNAEKP DDWNEDTDGE WEAPQILNPA CRfGGGEWKP 360 
PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK 1PNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
A DIYFDNFnC SEKEVADHWA ADOWRWKIMI ANANKPG VLK QLMAAAEGHP WLWIiYLVTA 480 
GVP1AUTSF CWPRKVKKKH KDTBV gTDI QPQTKGVLE QBEKEBKAAL EKPMPLEEEK 540 
KQNDGEMLEK EEESEPEEKS EEEIEUEGQ EESNQSNKSG SEDEMKEADE STGSGDGPIK 600 
SVRKRRVRKD 

SEQ ID NO:161 PEZ9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005932

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51
| | | | | |

GOGGAGOGOG GGCTCCCAGC G AAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGGACT 60 
GCTCTGGTGC TAG AATGC TG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG 120 
CTCTOCCGCX: CCGCCGGGCa GGCCGGGGAA GCCTCGAAGC CGGQATCCGG GCCCGAAGGG 180 
TCAGCAOCAG CTGGTCTCCC GTGGGCGCGG CCTICAATGT CAAGOCCCAG GGCAGOCGCT 240 
TGGACCTGTT CGGCGAGCGG GCGCGTCTTT TTGGAGTTCC TGAGCTGAGT GCCCCAG AAG 300 
GATTTCATAT TGCACAAGAA AAAGCCTTGA GAAAGACAG A ATTGCTTGTG GACOGTGCAT 360 
OTTCCACOCC AOCTOOGCCC CAOAOCOTGC TGATCTTCGA TOAOCTCKXS GATTOCTTAT 420 
GCAGAGTGGC CGACTTGGCT G ATTTTGTG A AAATCGCTCA OCCTG AGCCA GCATTCAG AG 480 
AAGCTGCGGA AGAAGCTTGT AGAAGTATTG GCACCATGGT AGAGAAGTTG AACACAAATG 540 
TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATrCCCl IU 600 
ATCCAGAAAC AAGGC3GAGTG GCTGAACIGT TTATGTTTGA TTTTGAAATT AGTGGAATCC 660 
ATCTAGACAA ACAAAAGCGT AAAAG AGCAG TGGACCTCAA TGTTAAAATC TTGGATTTGA 720 
GTAGTACATT TCTTATGGGA AOCAATTTTC CCAACAAG AT TGAGAAGCAT CTCTTACCAG 780 
AACACATTCG TOGTAACTTT ACATCTGCTG GGQATCATAT CATAATTGAT GGTCTGCACG 840 
CAGAATCAGC AG ATGACTTG GTGCGAGAAG CTGCTTATAA AATTTTTCTT TATCCCAATG 900 
CIXjGTCAATT G AAATGTTTA G AAG AATTGC TCAGCAGCAG AGATCTTCTG GCAAAGTTGG 960 
TGGGGTATTC CAGGTTTTCT CACAGGGCTC TCCAAGGAAC GATAGCTAAA AATCCAGAGA 1020 
CTGTCATGCA GTTCCTTGAA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 
TTGAGATGAT ACGAGGGATG AAAATGAAAC TGAATGCTCA AAATTGOGAA GTAATGCOCT 1140 
GGGAOOOCGC TTACTA CAGT GGTGTGATTC GTGCAGAAAC GTATAATATT GAGCCCAGCC 1200 
TATATTGCCC GTTXTTCTCT CTTGGAGCAT GCATGGAAGG OCTGAAT ATT TTGCTT AACA 1260 
GACTGTTGGG G ATTTCATTA TATGCAG AGC AGCCTGCAAA AGGAG AGGTG TGGAGCG AAG 1320 
ATGTCCGAAA ACTGGCTGTT GTTCATGAAT CTGAAGGATT GTTGGGGTAC ATTTACTGTG 1380 
ATTTTTTTCA GCG AGCAGAC AAACCACATC AGO ATTGCCA TTTCACTATC CGTGG AGGCA 1440 
GACTAAAGGA AG ATGG AG AC TATCAACTCC CACTTGTAGT TCTTATGCTG AATCTTOCCC 1500 
GTTOCTCAAG G AGTTCTCCA ACTTTGCTAA CTCCTGGCAT G ATGGAAAAT CTTTTCCATG 1560 
AAATGGG ACA TGCCATGCAT TCAATGCTAG G AGGTACTCG TTAOCAACAC GTCACTGGGA 1620 
CCAGGTGCXX TACTGATTTT GCTGAGGTTC CTTCTATTCT GATGGAGTAC TTTGCAAATG 1680 
ATTATCOAGT AOTTAACCAA TTTGOCAG AC ATTATCAGAC TOG ACAGOCA CTGCCAAAAA 1740 
ATAT GGTGT C TCGTCTTTGT GAATCTAAAA AGGTTTGTGC TGCAGCTGAT ATGCAACTTC 1800 
AGGTCrTTTA TGCCACTCTG GATCAAATCT ACCATGGGAA GCATOCCCTO AGGAATTCAA 1860 
CCACAGACAT TCTCAAGGAA ACACAAG AGA AATTCTATGG CCTACCATAT GTTCCAAATA 1920 
CTGCCTGOCA GCTGCGATTC AGCCACCTCG TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TCCATGGTTT GGAAGGAGTG TTTTCTACAG GATCCTTTCA 2040 
ACAGGGCTGC CGGGGAGCGC TATCGCAGGG AGATGCTGGC CCACOGTGGA GGCAGGGAGC 2100 
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATGAC TTCGTAAGTG 2160 
CCCTCGTTTC CGACTTGGAT CTGGACTTCG AAACTTTCCT CATGGATTCT GAAJAAAAGA 2220 
AACACTCTAC ACCTCTAATC AAGGTCATGT AGTAATG ACT TTGTTATAAA TGCTACAGCT 2280 
GTGAG AGCTT GTTTCTGATT GTTTCATTGT TCGCTTCTGT AATTCTGAAA AACTTTAAAC 2340 
TGGTAGAACT TGG AATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA 



SEQ ID NO:162 PEZ9 Protein sequence

Protein Accession #: NP_005923.1

1 11 21 31 41 51 
I I I I I I 

MLCVGRLGGL GARAAALPPR RAGRGSLEAG 1RARRVSTSW SPVGAAFNVK PQGSRLDLFG 60 
ERARLFGVPE LSAPEGFH1A QEKALRKTEL LVDRACSTPP GPQTVUFDB LSDSLCRVAD 120 
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LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180 
RVAELFMFDF BSGIHLDKQ KRKRAVDLNV KILDLSSTFL MGTNFPNKIE KHLLPEHIRR 240 
NFTSAGDHII HX3LHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300 
FSHRALQGTI AKNPETVMQF LEKLSDKLSE RTLKDFEMIR GMKMKLNAQN SEVMPWDPPY 360 
YSOVIRAERY NIEPSLYCPF FSLOACMEGL NUXNRLLGI SLYAEQPAKO BVWSEDVRKL 420 
A WHESEGLL GYIYCDFFQR ADKPHQDCHF TIRGGRLKED GDYQLPLWL MLNLPRSSRS 480 
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFAEVPSILM EYFANDYRW 540 
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHGKH PLRNSTTDIL 600 
KETQEKFYGL PYVPNTAWQL RFSHLVG YGA RYYS YLMSRA VASMVWKECF LQDPFNRAAG 660 
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVS ALVSD LDLDFETFLM DSB 



SEQ ID NO:163 PEZ8 DNA SEQUENCE

Nucleic Acid Accession #: AF103907

Coding sequence: none (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAGAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGGAGACC AGGAAGATCT GCATGGTGGG AAGG ACCTGA TG ATACAG AG 120 
GAATTACAAC ACATATACTT AGTGTTTCAA TGAACACCAA GATAAATAAG TGAAGAGCTA 180 
GTCCGCTGTG AGTCTCCTCA GTGACACAGG GCTGGATCAC CATCGACGGC ACTTTCTGAa 240 
TACTCAGTGC AGCAAAGAAA GACTACAGAC ATCTCAATGG CAGGGGTG AG AAATAAGAAA 300 
GGCTGCTGAC TTTACCATCT GAGGCCACAC ATCTGCTGAA ATGGAGATAA TTAACATCAC 360 
TAGAAACAGC AAGATGACAA TATAATGTCT AAGTAGTG AC ATGTTTTTGC ACATTTCCAG 420 
CCOCTTTAAA TATCCACACA CACAGGAAGC ACAAAAGGAA GCACAGAGAT CCCTGGGAGA 480 
AATGO00GGC CGCCATCTTG GGTCATOG AT GAGGCTOGOC CTGTGCCTGG T0COGCTTGT 540 
GAGGGAAGGA CATTAG AAAA TG AATTGATG TGTTOCTTAA AGGATGGGCA GGAAAACAGA 600 
TCXnXjTTGTO GATATTTATT TG AACGGGAT TACAGATTTG AAATGAAGTC ACAAAGTGAG 660 
CATTACCAAT GAGAGGAAAA CAGACGAGAA AATCTTGATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCAAGCT GGGGAGGAGA TAAOCACGGG 780 
GCAGAGGGTC AGGATTCTGG CCCTCCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC AAAGCTGTTG TAATATCTGA TCTCTACGGT TCCTTCTGGG 900 
CCCAACATTC TCCATATATC CAGCCACACT CATTTTTAAT ATTTAGTTCC CAGATCTGTA 960 
CTGTGACCTT TCTACACTGT AGAATAACAT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020 
GCTGCCTAAT ATGTAGCTGA CIO 111 HO C TAAGGAGTGT TCTGGCCCAO GGGATCTGTG 1080 
AACAGGCTGG GAAGCATCTC AAGATCTTTC CAGGGTTATA CTTACTAGCA CACAGCATGA 1140 
TCATTACGGA GTGAATTATC TAATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTOCCA CTTTTGTGCC CATTCTCAAG ACCTCAAAAT GTCATTGCAT TAATATCACA 1260 
GGATTAACI'i' 11 11 1 1 1 1 AA CCTGGAAGAA TTCAATGTTA CATGCAGCTA TGGGAATTTA 1320 
ATTACATATT TTGTTTTCCA GTGCAAAGAT GACTAAGTCC TTTATCCCTC CCCrTTGTTT 1380 
GATTTTTTTT CCAGTATAAA GTTAAAATGC TTAGCCTTGT ACTG AGGCTG TATACAGCAC 1440 
AGOCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAACAAAATC TAACTTGTAA TTOCTTGAAC ATGTCAGGAC ATACATT ATT CCTTCTGOCT 1560 
GAGAAGCTCT TCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTGAAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAG AAG GGACACATAT GAG ATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAG AGTTTA GATAAATATA TGAAATGCAA GAGCCACAGA 1740 
GGGAATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAGAATTC ATGCAGTGCA AATCOCCAAA GGTAAOCTTT ATCCATTTCA TGGTGAGTGC 1920 
GCTTT AGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980 
TTGTAGTTAA TTG AAAGAAA TAGGGCACTC TTGTG AGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAG AAT TTACAAAG AG CTACTCAGGA CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100 
GTGTGTGTGT GAGTGTACAT GCCAAAGTGT GCCTCTCTCT CTTGACOCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TG AGCTGCCA ATGATGTATC ACCACCATAT 2220 
CTCATTATTC TCCAGTAAAT GTGATAATAA TGTCATCTGT TAACATAAAA AAAGTTTG AC 2280 
TTCACAAAAG CAGCTGGAAA TGG ACAACCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACATA TATTGTTAGA AGCACCTCGC ATTTGTGGGT TCTCTTAAGC 2400 
AAAATACTTG CATTAGGTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTGAATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAGAGGATTC AGACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TCCCTCTTTG 2580 
TGTTCATGG A TAGTOCAATA AATAATGTTA TCTTTGAACT G ATGCTCATA GGAG AG AATA 2640 
TAAG AACTCT GAGTGATATC AACATTAGGG ATTCAAAGAA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GGAACCAAG A TACAAAG AAC TCTGAGCTGT CATCGTCCCC ATCTCTGTGA 2760 
GCCACAACCA ACAGCAGGAC CCAACGCATG TCTGAG ATCC TTAAATCAAG GAAACCAGTG 2820 
TCATGAGTTG AATTCTCCTA TTATGGATGC TAGCTTCTGG GCATCTCTGG CTCrCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTOCTTO C ACGACTTTTA TCTTTTCTCC AACACATCGC 2940 
TTACCAATCC TCICTCrGCTCT Gl TOCTTT GG ACTTCCCC ACAAGAATTT CAACGACTCT 3000 
CAAGTCTTTT CTTCCATCCC CACCACTAAC CTGAATGCCT AGAOCCTTAT TTTT ATTAAT 3060 
TTCCAATAG A TGCTGCCTAT GOGCTATATT GCTTTAG ATG AACATTAG AT ATTTAAAGCT 3120 
CAAGAGGTTC AAAATCCAAC TCATTATCTT CTCT1 1CTTT CACCTCCCTG CTCCTCTCCC 3180 
TATATTACTG ATTGCACTGA ACAGCATGGT CCCCAATGTA GCCATGCAAA TGAGAAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAG A CTGCTGAAGC CAGAAGGATG ACTGATTACG 3300 
CCTCATGGGT GGAGGGG AOC ACTOCTGGGC CTTCGTGATT GTCAGGAGCA AG ACCTGAG A 3360 
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCTTTCTA ATGAAGATCC ATAG AATTTG 3420 
CTACATTTGA GAATTCCAAT TAGOAACTCA CATGTTTT AT CTGCCCT ATC AATTTTTTAA 3480 
ACTTGCTGAA AATTAAGTTT TTTCAAAATC TGTCCTTGTA AATTACTTTT TCTTACAGTO 3540 
TCTTGGCATA CTATATCAAC TTTG ATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600 






AA AGTGGCTT jTAT ICTCTT TATTATTATT AT fTTCTl 1 ' I ' ACTACTATAT TAOOTTOTTA 3660 
TTATTTTOTT CTCTATAOTA TCAATTTATT TGATTTAGTT TCAATTTATT TTTATTGCTG 3720 
ACTTTTAAAA TAAGTG ATTC GGGGGGTGGG AG AACAGGGG AGGGAGAGCA TTAGG ACAAA 3780 
TACCTAATGC ATGTGGGACT TAAAACCTAG ATG ATGGGTT GATAGGTGCA GCAAACCACT 3840 
ATGGCACAOG TATAOCTGTG TAACAAACCT ACACATTCTG CACATGTATC OCAG AACGTA 3900 
AAGTAAAATT TAAAAAAAAG TO A 



Protein Accession #:

SEQ ID NO:164 PEZB DNA SEQUENCE

Nucleic Acid Accession #: AB028945
Coding sequence: (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

A32ATGATGA ACGTCCCCXK3 CGGAGGAGCG GCCGCGGTGA TGATGACGGG CTACAATAAT 60 
GGTCGCTGTC COCGGA AT TC TCTC TA CAGT GACTGCATTA TTGAGGAGAA GACGGTGGTC 120 
CTGCAGAAAA AAGACAATG A GGGCTTTGGA TTCGTGCTTC GAGGGGOCAA AGCTG ACACA 180 
CCCATTGAAG AATTCACACC AACACCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTG 240 
GATGAAGGTG GGGTGGCGTG GCAAGCCGGA CTAAGGACCG GGGACTTCTT GATTGAGGTT 300 
AACAATG AG A ATGTTGTCAA AGTOGGCCAC AGGCAGGTGO TG AACATGAT OCGGCAGGGA 360 
GGGAATC ACC TGGTCCTTAA GGTGGTCACG GTG ACCAGGA ATCTGGACOC CGAOGACAOC 420 
GCCAGG AAG A AAGCTOOOCC GCCTOCAAAG CGGGCACOG A CCACAGCCCT CAOOCTGOGC 480 
TOCAAGTGCA TG ACCTCGGA GCTGGAGGAG CTCGTGG ATA AAG ATAAAGC OGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC COGCGCTGCT GAGAACATGG CTGTGGAACC GAGGGTGGOG 600 
ACCATCAAGC AGOGGOCCAG CAGOCGGTGC TTCOCGGCGG GCTCAGACAT GAACTCTGTG 660 
TACGAACGCC AAGG AATCGC CGTGATGAOG OCCACTGTTC CTGGG AGCCC AAAAGCOCOG 720 
TTTCTGGGCA TCOCTOG AGG TACG ATGCG A AGGCAGAAAT CAATAG ACAG CAG AATCTTT 780 
CTATCAGG AA TAACAG AGG A AG AGCGGCAG TTTCTGGCTC CTOCAATGCT G AAGTTCACC 840 
AGAAGCCTGT OCATGOCGGA CACCTCTGAG G ACATCOGCC CTCCAGCGCA GTCTGTGOCC 900 
CCGTCCCCAC CACCACCTTC CCCAAOCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960 
GTCTAOGGGA OG ATTAAGCC TGCGTTCAAT CAGAATTCTG COGCCAAGGT GTOCOCCGCC 1020 
ACCAGGTCOG ACACOGTGGC CACCATGATG AGGGAGAAGG GGATGTACTT CAGGAGAGAG 1080 
CTGGAOCGCT ACTGCTTGGA CTCTGAAG AC CTCTACAGTC GGAATGCCGG CCOGCAAGOC 1140 
AACTTCOGCA ACAAGAGAGG CCAGATGOCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG COGTCTACGT CCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1260 
CAGTOCAAOG TGGAGGACAG CCCCGAGAAG ACGTGCTCCA TCCCTATCCC GACCATCATC 1320 
GTGAAGGAGC CGTCCACCAG CAGCAGOGGC AAG AGCAGOC AGGGCAGCAG CATGGAG ATC 1380 
GACCCCCAGG CCCCGGAGCC ACCGAGOCAG CTGCGGCCTG ACGAAAGCCT GAOCGTCAGC 1440 
AGCCOCTTTO COGCCGCCAT CGCCGGAGCC GTCCGOGACC GTGAGAAGCG GCTGGAAGOC 1500 
AGGAGGAACT CCCCGGCCTT CCTCTCCACA G AOCTGGGGG ATGAGGATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GOOCTCCATG TTOCCCGAGG AGGGGGATTT TGCTGACGAG 1620 
GACAGCGCTG AGCAGCTGTC ATCCCCCATG CCGAGTGCCA CGCOCAGGGA GOGGGAAAAC 1680 
CATTTOGTGG GTGGCGOGGA GGOCAGTGCT CGGGGTGAGG CTGGGAGGCC GCTGAATTGC 1740 
AOGTOCAAAG GOCAGGGGCC OGAGAGCAGC CCAGCAGTGC CCTCCGOGAG CAGCGGCACA 1800 
GCCGGCCCCG GGAATTATGT CCACOCACTC ACAGGGOGGC TGCTTGATCC CAGCTCCCCG 1860 
CTGGOCCTGG CACTCTCCGC AAGGGACCOA GOCATGAAGG AGTCTCAACA GGGACCCAAA 1920 
GGGG AGGCCC OCAAGGCGGA CCTCAACAAA CCTCTTTACA TTG ATACCAA AATGOGGGOC 1980 
AGCCTGG ATG CCGGCTTCOC TACGGTCAOC AGGCAGAACA CCCGGGGACC CCTGAGGCGG 2040 
CAGG AGACGG AGAACAAGTA CGAGACCGAC CTGGGCCGAG ACCGGAAAGG CGATGACAAG 2100 
AAGAACATGC TGATCGACAT CATGGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATG 2160 
GTGCACACCG TGGAOGOCAC TAAGCTGG AC AAOGOOCTGC AGGAAG AGGA OGAGAAGGCA 2220 
GAGGTGGAGA TGAAGCCAGA CAGCTCGCOG TCCGAGGTGC CAGAAGGTGT TTCCGAAACC 2280 
GAAGGTGCTT TACAG ATCTC CGCTGCCCCC GAGCCCACCA CCGTGCCCGG CAGAACCATC 2340 
GTCGCGGTGG GCTCCATGG A AG AGGCGGTG ATTTTGCCAT TCCGCATCCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT ATTTTTACAG AGCCATTGCC TCCTCCCCTC 2460 
GAATTTGCAA ATAGTTTTGA TATCCCCGAT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 
GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580 
CCAACCAACT CTGCAGACAG CAAGAAGOCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TTCCTGCCAC CCCCTG AAAG CTTTGACGCC GTCGCCGACT CTGGG ATCG A GG AGGTGG AC 2700 
AGCCGGAGTA GCAGCG ACCA CCACCTCG AG ACGACCAGCA CTATCTCCAC CGTGTCTAGC 2760 
ATCTCCACCC TGTCTTCCGA AGGTGGAG AG AATGTGGACA CCTGCACAOT CTATGCAGAT 2820 
GGGCAAGCAT TTATGGTTGA CAAACCCCCA GTACCTCCTA AGCCAAAAAT GAAGOCCATC 2880 
ATTCACAAAA GCAATGCACT TTATCAAGAC GCGCTCGTGG AAGAAGATGT AGATAGCTTT 2940 
GTTATCCCCC CGCCCGCTCC CCCGCOCCCG CCGGGCAGTG CCCAGCCTGG GATGGCCAAG 3000 
GTTCTCCAGC CAAGGAOCTC CAAGTTGTGG GGCGACGTCA CAGAG ATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 
CGAGAGAAAT TGGCAAAGCC GGGGGAAGG A CTGGATTCAC CAATGGGAGC CAAGTCCGCC 3180 
AGOCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCCC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATG AAA GCAGGAOCTC AGGAACAAG A CGTGCOCCAA GCOCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAGACCCT GCCCGCCCCC CTGTCTGCTG CCACCGCCTC TCCTTCTCCC 3420 
GCICICTCAG ATGTCTTTAG CCTTCCAAGC CAGOCCCCTT CTGGGGATCT ATTTGGCTTG 3480 
AACCCAGCGG GACGCAGTAG GTCGCCATCC COCTCGATAC TGCAACAGCC AATCTCAAAT 3540 
AAGCCTTTTA CAACTAAACC TGTCCACCTG TGGACTAAAC CAG ATGTGGC CGATTGGCTG 3600 
GAAAGTCTAA ACTTGGGTG A ACATAAAGAG GCCTTCATGG ACAATGAG AT CGATGGCAGT 3660 
CACTTACCAA ACCTGCAGAA GGAGGACCTC ATCG ATCTTG GGGTAACTCG AGTCGGGCAC 3720 
AGAATCAACA TAGAAAGGGC TTTGAAACAG CTGCTGGACA GATAAGGACG GCTGCTCTOC 3780
ACCTCGCAG A CTGCTCTTGT TATAAGTAGA GATGGGCTCG TGCTGAAACA TCTG AATGCC 3840 
AAGCG AAGTC TGTG AGCATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCAAAGAAA 3900 
TACTGAGTTO TGTCCACAAC ATGGCTGGOT CTTCAGACCC CTGGCTCACC ATGTGGGTOT 3960 
CTK3GGCAGT TTCTATCACA CATGGGACAA GGGQAGGGAG TTTTTCTAAC ATGGAAAAAG 4020 
ATTCCCAGCC TGCCGCCCAG CATGCAGGTG GOCTDGCTTT GCCGGGTCCG AGAGGCTOCC 4080 
CGTCAATTTT GCACGGGATC CTAGCTCTTG TAGGCAGACA CCAGTGCACT CTAGATACCT 4140 
CCTGAG ACCT CCGTCCTCTG CTTTOCGGGC AGCTCTCACC AOCCCAGGCC OOGGCATGAG 4200 
CC LT1 1 C C 1C AGTCCTGTGG CCTCTCAGAG OACACCTGAT GCTCAOCTGC OOCTCTTTCT 4260 
CCTGCACITO GCTTGCAGTG AGATGCTOCC AG ATGCATTT GTOCAGTGCC CCATCATGGG 4320 
CCIOAAAGGC AGAGAAACTT TTTCCTACAC AGATTCTTTT CCOCATCTCC TCCTGTGGTT 4380 
TGCATCCATG CCTCTTTGGC CATG AGGTTC CTGGCAGTGC TGGGAGTTTG GATGGGATCG 4440 
TGOOCAGCTT TGCTTAG C 1 1 TCT1 1 A TTIC TGCAAATCTO TTAGCATAAT TCCAAGGTGG 4500 
CCAAGCAG AT GTCACATGGA GTTAGTCAAA GCACAAAGTC ACGATTCCAC AATGG AGGGG 4560 
AGACCTGGCC AAGGG AGCCA GOCAGCGTGC AACTGCCCAA GCTCCAGGTC TCCAGGACAA 4620 
GAGCAGTTGT CTGCCATGAG CACCCATCCA GGATGG AG AA TAAGGGCTTC TCTGCCTCTC 4680 
AGAATTCTTT TTAATTGAAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740 
ACTTTGACAT G AAGGACAAG AAGACGCATG GCTCATGGCG GGCACATGOG GCTGCCAGTG 4800 
AGACAGCGTC TCCTCTGGGA GCTGGGCGGG CACAGCATCC TCAGTTCTGT GOCCAGOCAA 4860 
GGGTG AGCAT CTCTGCTGAG ACAGTCCTTT TGCTCTOGGA GGCCAGGG AA GATGGTACTT 4920 
AGAGGCTTTT CCOCTATCX3C TCTGGGTGTC TAGGAATCOC ACCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGGACC CAGTGGGTAT GGAGTATAGA CAGAAOCCAG GGTTGAGAAC 5040 
AGAAGGTGGG OGGCAGGATC AGAGTGAAAG CAGAGGCGTO AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGOC TGOCAGGTCA GCCTCTCrGG CAAGGCTTTC TTGAGGOCCG CCOLTITCIT 5160
TCCCCGGAGT CCCTCCACCC CATAACAATA CCTCGAATTT CCAAAAG AGG TCACCAG ATG 5220 
CACATGGGCC GCAAAACACA CAGTCAGGCT TCCAGCACAT TCTCCCCCAT TTGGAGGATA 5280 
CTCGAATGTC A G O U 1 1 IP G TTTTATTATT ATTTCAGAA C TAGCTCAGCC CATCTCTAAT 5340 
TATAAAACAT GGTTTTGTTT TTTTTTTTTC CTTTTTTTCT TGATTAGGTC TGGAACAGCT 5400 
CTAGAATGAA CACATAAAAT TTAGCAATTT AAAATCTTTC TTTACTGCAA GTTTAAATAG 5460 
TTGTACAG AT AGTTTATAAG CACAATATTT TAAG AAAAAA AAGTGGCTGG TCTACT AGGC 5520 
AGOCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAGAAAAA AAAACTTTTG TG ATTTAATA 5580 
ATACTATTTC TGTGGAATAA TTATAAAAGT ATGAOCTTTT TAAATCAAOC TTATTTGGAT 5640 
GCATCTG AAC CAGCAGAGCT GTGTTATATT TICTATCTTT GCTAG AACXT CXTTCATTG AA 5700 
GGACAATTTC TTCAAAGTGG TT ACAATTCA TAATGCAGCA OTTTCTO CAA AAACAAAAAC 5760 
AAAACACACA CCACACACAC GOGCTTTTOC AGTCACACAC COCTGATGTT GGAAGCAAGT 5820 
TTTTGGACCT TCTGTTOCAA AACCTTTTGC AGGTCAATCT TTGTATTTGA AATGATCCAA 5880 
TCCAACTTGA AGTCAATTGA ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940 
ATCATG AG AT GAATG AGCAT TACTCTAGAT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000 
TGCIG AGGCT GCCCCATATT TTAGAAAATG TAAAAATGGT GGTTTGGCCA TTAATTTGTC 6060 
TTCCATTTGA TGATAOCGCA AAATTOCGTG AGTCCATTCC TTTGGCATGG CACTTIOCCT 6120 
GGGOCTACAG TTGGTATTAC CTCTGTGCTC AGTGCCAGGC AAAACACTAG CTCAAAGGAG 6180 
AGTCAAGGAA ACCGCTGGCA GACGATAAGC AGTOGAAACT OGTGACTTCG GTTTGTTGAA 6240 
C1TTGGCAGC CAGTTGGTGA GGGCCAGATO TT AT T COCTT T C T T AAAGAT ACTCCAAGCC 6300 
ACATGCCACT AAOCACAAGC AAGCTGGCTG CAAGACTAAA GAGCTGATAA CATAGTTTAT 6360 
TTTTACACTG TCTTATTATA GAG AAGTAAT AGACCTATCA GAACCTGCAC TGACCAACAA 6420 
ATAAACACAT GTTGCCAAGA TGAATCGGTC TCTATCTCTA TCTGCTTATT TTGGTACTGA 6480 
AAGCAATAGT TCCTCATTCA AATCACCACC CACTGTTCTC CCCCTITGGG ACATGTTAGG 6540 
ACG AGGCCCT ATTCCATGCC CCTCTTTAAT GGTGGAACAA ATGTTAAACT GCICATCTAA 6600 
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TCTTACATAC TGTAATTTAA 6660 
ATAAACTGCA TTTACATGCC TAGTTTCTGT AATATTGTOT ATACAAAAGC CAAATCTCTC 6720 
AAAATGT AAA TTATCTATAC CTGOCAAGAT AOCTTITOCA GGGTGTCTGC GCACATTTT A 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TQATTTGCTG AACTTTACAT 6840 
ATCACAAAGT GAATTATTTG TG ATACTTTA GTTAATAAAA TGGTAAATTT TTTTCTCAGT 6900 
TATTGAACAA GCAAGCATTA TDCAGTTGAT CTGGCAATG A CM 1 1 1U 1 U T GTG GGCCACA 6960 
ATATTGATTT TCOCATTAAC AAl Mil ill iU 1 1 1 il l AA ATACTAATAT GTTTCACACT 7020 
ATAGTTTGTG TAACAACACG TCTTCGCATT ATCTATU11U CTOTTACTTT TGTGCTTTTA 7080 
TTCTTTTTAG ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTrTCTOCCAATC 7140 
CTTAAATCTC TTGTATGGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGCACTA 7200 
AGGTTGTGGT TCTG ATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260 
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320 
CTGACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTCCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTCCATT TCCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAGAATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTGACCG CTGCT ATAGG CGTGGAGCTG AGGCTCGGCT 7500 
nTCCTTTTG TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAGA TCCGGACCCT 7560 
TCCCTTAGGG TCCAGTCTCC CCACACCCCA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620 
GAGTGGCAGA ACTGGGCCGC CICTCTGGTT GACAAGCAAA CCACATGCTA AGGCTTGGAG 7680 
CAAG AGAG AA TTTGTGTCTA TTGGCAAAOA ACTAAGCCAG G AAG ACATGG GCCATCCCTC 7740 
GGCTTTAGGG AAGCATATTT TAAACCTAAA OGTTG AACTT CTTCTTTGGC CTCACCAGTG 7800 
AAAA C1 1GTT GTCTTTAGTT OCTAAAGTTT CTTCTACTTT GGCACATTCC CCAGTTGAGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGG AAA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920 
OOCrCAAGCrcrCCOGCnC ACCATOCAAT A Ol 1 1C1U X AAACCrrOGC ACCOCCCTAG 7980 
ACTTTGCTTC CAATGGTTTC TTCCAGACCA CUI IC CTAG ATGAATATAT TOGTTTAOCT 8040 
TACTAGGAAA ATTATTGGAA GATTTTTIUT TTTACTTGAA ATTGGAGGCA TTTTAATAAC 8100 
TGGOGAACTG GAATOTCTTT CltTTATTrGT AG ACAAOCAT GTAOOCATGC AAGTAGGTGA 8160 
ACATTOCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CC Cb 1UCHT 8220 
GTCGGGAATC AGAGAATTIC CAAA CnUM TCTCAGACTT OCX5CAGATCT CATCACTTTG 8280 
ATTTCTAATC CATGCTGTAT TGGTGATTTT GTTTATCGTT OCTGTAACTT GTTCTACATT 8340 
CCACAGTCTT TACOGTTTTA TGTTCAAAAT TACAACAATC CCTGTOCATT GATTOCACTC 8400 
TGGAACTCTT TGnCATCCC AATTTTG AAA TTTTAATACX3 AGCCTTCAAA TAAACACAGA 8460 
AAAGAAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID NO:165 PEZ9 Protein sequence
Protein Accession #: BAA82974.1



1 11 21 31 41 51 
I I I I I I 

MMMNVPGGGA AAVMMTGYNN GRCPRNSLYS DCnEEKTW LQKKDNEGPG FVLRGAKADT 60 
PTEEFTPTPA FPALQYLESV DfiGGVAWQAG LRTGDFUEV NNENWKVGH RQWNMRQG 120 
GNHLVLKWT VTRNLDPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEE LVDKDKPEEI 180 
VPASKPSRAA ENMAVEPRVA TKQRPSSRC FPAGSDMNS V YERQGIAVMT PTVPGSPKAP 240 
FLGIPRGTMR RQKSIDSRIF LSGHEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQSVP 300 
PSPPPPSPTT YNCPKSPTPR VYGTTKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRE 360 
LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGK3 ASKAVYVPAK PARRKGMLVK 420 
QSNVEDSPEK TCSIPffTO VKEPSTSSSG KSSQGSSMEI DPQAPEPPSQ LRPDESLTVS 480 
SPFAAAIAGA VRDREKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADB 540 
DSAEQL5SFM PSATPREPEN HFVGGAEAS A PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600 
AGPGNYVHPL TGRLLDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK PLYIDTKMRP 660 
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMUDIMDT SQQKSAGLLM 720 
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQEAAP EPTTVPGRTI 780 
V A VGSMEEA V ILPERIPPPP LAS VDLDEDF IFTEPLPPPL EFANSFDIPD DRAAS VPALS 840 
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESEDA VADSGIEEVD 900 
SRSSSDHHLE TTSTISTVSS ETLSSEGGE NVDTCTVYAD GQAFMVDKPP VPPKPKMKP1 960 
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGS AQPGMAK VLQFRTSKLW GDVTEKSPI 1020 
LSOPKANVIS ELNSILQQMN REKLAKPGEG LDSPMGAKSA SLAPRSPHM STISGTRSTT 1080 
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPWSPT EMNKETLPAP LSAATASPSP 1140 
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPD V AD WL 1200 
ESLNLGEHKE AFMDNBDGS HLPNLQKEDL IDLGVTRVGH RMNffiRALKQ LLDR 



SEQ ID NO:166 PEZ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_000024

Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I

ACTGCGAAGC GGCTTCTTCA GAGCAGGGGC TGGAACTGGC AGGCACCGCG AGGCOCTAGC 60 
ACCCGACAAG CTGAGTGTGC AGG ACGAGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120 
GAGGCTTCCA GGCGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTCCG CCCGCTGAGG 180 
CGCCCCCAGC CAGTGCGCTT ACCTGCCAGA CTGCGCGCCOG^GGCAACC CGGGAACGGC 240 
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGCGCOGG ACCACGACGT CACGCAGCAA 300 
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATCGTCATGT CTCTCATCXjT OCTGGOCATC 360 
GTGTTTGGCA ATGTGCTGGT CATCACAGOC ATTGCCAAGT TCGAGCGTCT GCAGAOGGTC 420 
ACCAACT ACT TCATCACTTC ACTGGCCTGT GCTG ATCTGG TCATGGGOCT GGCAGTGGTG 480 
CCCTTTGGGG CCGCCCATAT TCTTATGAAA ATGTGGACTT TTGGCAACTT CTGGTGCGAG 540 
TTTTGG ACTT CCATTGATGT GCTGTGCGTC ACGGCCAGCA TTGAGA(XCT GTGCGTGATC 600 
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAG AGCCT GCTGACCAAG 660 
AATAAGGCCC GGGTGATCAT TCTGATGGTG TOG ATTGTGT CAGGOCTTAC CTCCTTCTTG 720 
COCATTCAG A TGCACTGGTA CCGGGCCACC CACCAGGAAG CCATCAACTG CTATGCCAAT 780 
GAGAOCTGCT GTG ACTTCTT CAOGAACCAA GCCTATGCCA TTGCXTTCTTC CATCGTGTCC 840 
TTCTACGTTC CCCTGGTGAT CATGGTCTTC GTCTACTCCA OGGTCTTTCA GGAGGCCAAA 900 
AGGCAGCTCC AG AAG ATTG A CAAATCTG AG GGCCGCTTCC ATGTCCAGAA CCTTAGCCAG 960 
GTGGAGCAGG ATGGGCGGAC GGGGCATGG A CTCCGCAGAT CTTCCAAGTT CTGCTTGAAG 1020 
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCAOOCT CTGCIGGCTG 1080 
COCTTCTICA TCGTTAACAT TGTGCATGTG ATOCAGGATA ACC1CATCCG TAAGGAAGTT 1140
TACATCCTCC TAAATTGG AT AGGCTATGTC AATTCTGGTT TCAATCCOCT TATCTACIGC 1200 
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCTrCTTTG 1260 
AAGGCCTATG GGAATGGCTA CTCCAGCAAC GGCAACACAG GGG AGCAGAG TGGATATCAC 1320 
GTGGAACAGG AGAAAGAAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380 
GTGGGCCATC AAGGTACTGT GOCTAG CGAT AACATTGATT CACAAGGGAG GAATTGTAGT 1440 
ACAAATGACT CACTGCT GTA AA GCAGTTTT TCTACTTTTA AAGACCCCCC CCCCCCCAAC 1500 
AGAACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560 
TGTATAG AGA TATGCAGAAG GAAGGGCATC CTTCTGCCTT TTTTA U 1 11 TTAAGCTGTA 1620 
AAAAGAGAGA AAACTTATTT GAGTG ATTAT TTGTTATTTG TACAGTTCAG TTCCTCTTTG 1680 
CATGGAATTT GTAAGTTTAT GTCTAAAG AG CTTTAGTOCT AGAGG ACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTA CCTC A CT ATTCAAG TATTAGGGGT AATATATTGC 1800 
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTCCTACA CCCTTGG ACT TG AGGATTTT 1860 
GAGTATCTCG G ACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920 
ACACGGGGTA TTTTAGGCAG GGATTTG AGG AGCAGCTTCA GTTGTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGT AAATA AAATGTTTGA OCATG 
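
The layout used throughout these listings (blocks of ten residues, a running position counter at the end of each full line, and a coding range given as 1-based inclusive coordinates) can be read mechanically. The following is a minimal Python sketch and not part of the original disclosure. The function names `parse_listing` and `coding_region` and the toy input are hypothetical, and the sketch assumes that the only all-digit token to be dropped is the last one on a line. It does not attempt to correct character-recognition errors inside the residues.

```python
def parse_listing(lines):
    """Join residue blocks into one string, dropping trailing position counters."""
    parts = []
    for line in lines:
        tokens = line.split()
        # A full line ends with its running position (e.g. "60"); drop it.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        parts.append("".join(tokens))
    return "".join(parts)


def coding_region(seq, start, end):
    """Return residues start..end, using 1-based inclusive coordinates."""
    return seq[start - 1:end]


# Toy input (hypothetical, not taken from any sequence in this listing).
toy_lines = ["AAACCATGGC TTGATAAGGG 20", "CCC"]
toy_seq = parse_listing(toy_lines)
print(coding_region(toy_seq, 6, 14))  # prints ATGGCTTGA
```

For a listing such as the one above, the same call would be made with the stated coding range (for example 220 and 1461), once the lines have been cleaned of recognition errors.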



SEQ ID NO:167 PEZ4 Protein sequence
Protein Accession #: NP_000015.1



1 11 21 31 41 51
I I I I I I

MGQPGNGS AF LLAPNRSHAP DHDVTQQRDE VWWGMGIVM SUVLAIVFG NVLVTTAIAK 60 
FERLQTVTNY FITSLACADL VMGLA WFFG AAHILMKMWT PGNFWCEFWT SID VLCVTAS 120 
ETLCVIAVD RYFAITSPFK YQSLLTKNKA RVIILMVWIV SGLTSFLPIQ MHWYRATHQE 180 
AINCYANETC CDFFTNQAYA IASSIVSFYV PLVIMVFVYS RVFQEAKRQL QHDKSEGRF 240 
HVQNLSQVEQ DGRTGHGLRR SSKFCLKEHK ALKTLGIIMG TFTLCWLPFF IVNIVHV1QD 300 
NURKEVYIL LNWIGYVNSG FNPLIYCRSP DFRIAFQELL CLRRSSLKAY GNGYSSNGNT 360 
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDN1D SQGRNCSTND SLL 



SEQ ID NO:168 PEZ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_004457

Coding sequence: 143-2305 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 
I I I I I I 

GAATTGGTTG TTGGGAAGGA CTGGGGAAAC AGCTGTAACA TtTGCCACCC TCAG AAGCTG 60 
CTGGTCCTGT GTCACACCAC CTTAGOCTCT TGATCGAGGA AGATTCTCGC TGAAGTCTGT 120 
TAATTCTACT TTTTGAGTAC TT AJQQAATAA CCAOGTGTCT TCAAAACCAT CT AOCATGAA 180 
GCTA AAACA T ACCATCAACC CTATTCTTTT ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTATTTTA ACATACATTC CGTTTTATTT TTTCTCCGAG TCAAG ACAAG AAAAATCAAA 300 
CCGAATTAAA GCAAAGCCTG TAAATTCAAA ACCTGATTCT GCATACAG AT CTGTTAATAG 360 
TTTGGATGGT TTGGCTTCAG TATTATACOC TGGATGTGAT ACTTTAGATA AAGTTTTTAC 420 
ATATGCAAAA AACAAATTTA AG AACAAAAG ACTCTTGGG A ACACGTG AAG TTTTAAATGA 480 
GGAAGATG AA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATTCTTG GACAGTATAA 540 
TTGGCnTCC TATGAAG ATG TCTTTGTTCG AGCCTTTAAT TTTGGAAATG GATTACAGAT 600 
GTTGGGTCAG AAACCAAAGA CCAACATCGC CATCTTCTGT GAGACCAGGG OCGAGTGGAT 660 
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720 
AGGAGGTCCA GOCATTGTTC ATGCATTAAA TGAAACAGAG GTGACCAACA TCATTACTAG 780 
TAAAGAACTC TTACAAACAA AGTTGAAGGA TATAGTTTCT TTGGTCOCAC GOCTGOGGCA 840 
CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900 
GCATACCATG GCTGCAGTGG AGGOCCTGGG AGOCAAGGGC AGCATGG AAA AOCAACCTCA 960 
TAGCAAACCA TTGCCCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020 
TOCAAAGGGA GTCATG ATCT CACATAGTAA CATTATTGCT GGTATAACTG GG ATGGCAGA 1080 
AAGG ATTCCA GAACTAGG AG AGGAAG ATGT CTAC ATTGG A TATTTGOCTC TGGOCCATGT 1140 
TCTAG AATTA AGTGCTOAGC TTGTCTGTCT TTCTCACGGA TGGCGCATTG GTTACTCTTC 1200 
ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGGATACATC 1260 
CATGTTGAAA OCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320 
TGTCATGAAT AAAGTCAGTO AAATGAGTAG TTTTCAACGT AATCTGTTTA TTCTGGCCTA 1380 
TAATTACAAA ATGGAACAGA TTTCAAAAGG ACGTAATACTCCACTGTGCG ACAGCTTTGT 1440 
TTTCCGGAAA GTTOG AAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGOGC 1500 
TCCACTTTCT GCAACCACGC AGCGATTCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560 
GGGATACGGG CTCACTG AAT CTGCTGGGGC TGGAACAATT TCCGAAGTGT GGG ACTACAA 1620 
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680 
AGGTGGATAC TTTAATACTG ATAAGOCACA CCCCAGGGGT GAAATTCTTA TTGGGGGCCA 1740 
AAGTGTGACA ATGGGGTACT ACAAAAATG A AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800 
TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTGAAC CCGATGGATG 1860 
CTTAAAG ATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920 
TCTTGGGAAA GTAGAGGCAG CTTTGAAGAA TCTTOCACTA GTAGATAACA TTTGTGCATA 1980 
TGCAAACAGT TATGATTCTT ATGTCATTGG ATTTGTTGTG OCAAATCAAA AGGAACTAAC 2040 
TGAACTAGCT CG AAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTGA 2100 
AATGGAAAAT GAGGTACTTA AAGTGCTTTC CGAAGCTOCT ATTTCAGCAA GTCTGG AAAA 2160 
GTTTGAAATT CCAGTAAAAA T TO GTTT G AO TCCTGAACCG TGGACCCCTG AAACTGGTCT 2220 
GGTG ACAGAT GCCTTCAAGC TG AAACGCAA AGAGCTTAAA ACACATTACC AGGCGGACAT 2280 
TGAGOGAATG TATGGAAGAA AATA ATTATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340 
TCACATCAAA TAGGAAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTOCATTC 2400 
CTCATATTAA ACTATTACTT CTCATG ACGT CACCATTTTT AACTG ACAGG ATT AGT AAAA 2460 
CATTAAG ACA GCAAACTTGT GTCTGTCTCT TCTTTCATTT TCCCCGCCAC CAACTTACTT 2520 
TACCACCTAT GACTGTACTT GTCAGTATG A GAATTTTTCT GAATCATATT GGGGAAGCAG 2580 
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTGTATAAT GTTCAGTTTG 2640 
TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700 
TGGGGGCTTT TTTACTTACT GTATTTAAAA ATACAAGGGT ATTGATATGA AATTATGTAA 2760 
ATTTCAAATG CTTATGAATC AAATCATTGT TG AACAAAAG ATTTGTTGCT GTGTAATTAT 2820 
TGTCTTGTAT GCATTTGAGA GAAATAAATA TACCCATACT TATGTTTTAA G AAGTTGAGA 2880 
TCTTGTG AAT ATATGCCTGT CAGTGTCTTC TTTATATATT TATTTTTTAT TAG AAAAAAT 2940 
GAAGTTTGGT TGGTGATGCA TG AAACAAAA TAGCAAG AGA GGGTTATAGT TTAATAGTAA 3000 
GGGAGATAAC ACAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060 
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTGAAACTAG GAAAATACCC AAACTTAAAT 3120 
GG AAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180 
GAATATTTCG TTGACTATAT GTACATTG AG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240 
GAAAA TTATT CTATTCCAAA GTCTCCTTTT AGTCTAG ATA ATCATTATTT CATTTTAAAA 3300 
TTAGTGTTTT TCATAGTTTG CACTGATGCG TGTATGGATG TGTGTGAGTC AGTGGTAGCT 3360 
TATTTAAAAA GCAOCTTATC CTTTCTCCCA TAACCTTTGT ACACTAAAAA ATG AAAGAAT 3420 
TTAGAATGTA TTTGATGATA GCATTCTCAC TAAGACACAT GAGAATTTAA CTTTATAACC 3480 
GOGTGAGTT A AGATTTAATT CATAGGTTTT GATGTCATTO TTGAAGTTAT TTGTAATTCA 3540 
GAAACCTTGC TTGTGTGAT A CATAGT AAGT C1CTTCATTT ATTACTGCTT GCCTGTTGTT 3600 



ATATCTGG AT TATCAAAAGC AATAGTGCAC CAATTAAGAT GTGCTCAAAT CAGGACTTAA 3660 
ATCAT AGGCA CCACATTTTT CATGTCAG AC TAGTTACTTr GTTGATTCTC AGTTACTGTA 3720 
GGCATCAAAA GGCAAAAATC A 



SEQ ID NO:169 PEZ1 Protein sequence
Protein Accession #: NP_004448.1

1 11 21 31 41 51
I I I I I I
MNNHVSSKPS TMKLKHTINP ILLYFIHFU SLYTTLTYEP FYFFSESRQE KSNRIKAKPV 60 
NSKPDS AYRS VNSLDGLAS V LYPGCDTLDK VFTYAKNKFK NKRLLQTREV LNEEDEVQPN 120 
GKIFKKVUjG QYNWLSYEDV FVRAFNFGNG LQMLG QKPKT NIAIFCETRA EWMIAAQACF 180 
MYNPQLVTLY ATLGGPAIVH ALNETEVTNI ITSKELLOTK LKDIVSLVPR LRHHTYDGK 240 
PPTWSDFPKG HVHTMAAVE ALGAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKGVMIS 300 
HSNIIAGITG MAERIPELGE EDVYH3 YLPL AHVLELSAEL VCLSHGCRIG YSSPQTLADQ 360 
SSKDCKGSKG DTSMLKPTLM AAVPEIMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQI 420 
SKGRNTPLCD SFVFRKVRSL LGGNIRLLLC GGAPLSATTQ RFMNICRXP VGQGYGLTES 480 
AGAGTISEVW DYNTGRVGAP LVCCHKLKN WEEGGYRfFD KPHPRGEDLI GGQSVTMGYY 540 
KNEAKTKADF SEDENGQRWL CTGDIGEFEP DGCLKHDRK KDLVKLQAGE YVSLGKVEAA 600 
LKNLPLVDNI CAYANSYHSY VIGFWPNQK ELTELARKKG LKGTWEEUCN SOEMENEVLK 660 
VLSEAAIS AS LEKFEtPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADEERMYGRK 

SEQ ID NO: 170 PCQ7 DNA SEQUENCE 

Nucleic Acid Accession #: none found

Coding sequence: 38-1075 (underlined sequences correspond to start and stop codons)



A6CAACGAC6 
CCTGCTGCTG 
GTGCAACATA 
GTGTGACGGG 
GTCGAAATGT 
CTTCCGGTGC 
AAACCCTCTG 
GAGCTTCATC 
AAGTTCTCAA 
TTACCCCAGC 
CCTGCTGGCA 
GCACCGGCTG 
CTGCAACGTC 
GAATGCQTCQ 
T6CGT66TAT 



GGGCACTGCT 
AGTTATTCCA 
TGCTCATGGG 
AACTATCTCT 
TGACATQATC 
CACCCTCATT 
AAATAGGCTG 
CGCTGGACCC 
ATGATCTAAC 
ATCAAAAOCT 
AAGAAAACTT 
AAGGACTCTG 
CTCATTCTGA 
GAGCCCCTOC 
TACACCTGCC 
ACCTGCCCGT 
GTATGTCCCT 
CTCCAAAGTT 
ACTGGTTTCT 
CTGCACTGTG 
GGTCAGGGTC 
AGACAATTTG 
TGAAACAGTQ 
AGCTGTCTCT 
ACACCCTTGC 
ACATTTGTGC 
AGAGGGACTC 
TTCTCTGTGT 
AGCTCTTOTT 
CCACICOGGG 
AACCTOTTTO 
TGATCCTGTT 



11 

I 

CCGGGCAGCG 
AGCAGCGCCG 
CCAGGCAACT 
CTGCCTGACT 
GGCCCAACCT 
AATGGGTTTG 
CTTTGCTCCA 
TGCGATGGAC 
GAACCCGGCA 
ATCACCTATG 
CTGGTCTTGC 
CAGCACCCT6 
ACCTACAACG 
GAAQTAGGCT 
GACCTTCCTC 
CGCTACCGC? 
CTGAGCGTGG 
GAGCCCAGGO 
AAGTCCATAT 
AAGCTCTTTA 
GGATTCCCCT 
TOTTGTGCGT 
TTTCACATTA 
GGAGAGAGCA 
AATTCTCTCT 
CAGGAGGCCA 
GCTTTGCACA 
TGGACGTGAG 
AAACCATCTA 
GAGCTTTCCT 
CATGAGTTTA 
CTGGCTCTAC 
AGCCAAGGAA 
GTGGCCCACA 
CCCTTAACAC 
ATCACAGGTG 
CACGCTCCTC 
AGGOCTCTCC 
GAGTCAAGAT 
TGTTTGTTTT 
TTTTTTGTTT 
CCCGCTGAGC 
ATTGTTGCAC 
CTCTCTCCCT 
CCAGTCAGCC 
TGGCAAGAAA 
CAGCTGTCAC 
ACGCTAATTA 
CTGTAGACTT 



21 

I 

GGAGCGGOGG 
CGGAGAGCCA 
TCATGTGCAG 
GCTTCGACAA 
TCTTCCCCTG 
AGGACTGTCC 
CCGCCCGCTA 
AGAATAACTG 
GTGGGCAGGT 
CCATGATCGG 
ACCACCAGCG 
TGCTGCTGTC 
TCAATAATGG 
CCCCACCCTC 
CACCGCCCTA 



31 
I 

CCGCGC CATG 
GCTGCTCCCC 
CAATGGACGO 
GAGTGATGAG 
TGCCAGCGGC 



CCACTGCAAG 
TCAAGACAAC 
GTTTGTGACT 
CAGCTCCGTC 
GAAGCGGAAC 
CCGCCTGGTG 
CATCCAGTAT 
CTACTCCGAG 
CTCTTCTGAC 



41 

I 

TGGCTGCTGG 
GGGAACAACT 
TGCATCCCGO 
AAGGAGTGCC 
ATCCATTGCA 
GATGAAGAGA 
AACGGCCTCT 
AGTGATGAGG 
TCAGAGAACC 
ATTTTT GTO C 
AACCTCATGA 
GTCCTGGACC 
GTGGCCAGCC 
GCCTTGCTGG 
AOGGAATCTC 



ACTCTOAOOC 
GGGTTAATCT 
AGCACCTGTA 
CCTCCCCCAG 
C TTTTCTGTC 
TTCTGTTTCT 
ATGTTTCTGT 
GCTGGGTAGT 



CCACAGCCCG 
CAGCCAGGGC 
GCTCTGACTT 
AGGATGTCTC 



TAACACCCTT 
CCCTGTATAA 
CAGCAQCATA 
TCCAAGTTCT 
AGCCACTTAC 
TGAGGACCTA 
CCCAGCCTGT 
TTGCAAAGTC 
AGAGCCATGT 
TTCCCAAGGT 
CAACATCCCA 
TTTCCATTTQ 
TTCCCTTCTA 
TTCCTTTAAC 
CCCGTGATAA 
TTTGAGGTTA 
CCGTGTATAG 
ACAGGGCCCG 
OCACACTGAC 
CCATTCAGAA 
AAACAGAGCC 
TTCT1TCTTC 



AGGTCACTCT 
GTTGGAGAGA 
GCTATATTGG 
TACCTTATAG 
GTCACCCCCC 
ATGCCCCCAG 
CAGCAGTCGC 
ATTCTGGCTT 
TATCATCAGC 
CAGCTCCTAA 
CTGGTTTCTG 
ACTTGAGTTG 
CTTGCTCATT 

cnrnrACCT 

TCAATACCTC 
CCCAATACCA 
GTAGTTTCTC 
GATCTATTTT 
GTTAAGGGAC 
AAGGTCCAAA 
CAAGTCACTC 
TTATTTATCA 
TCTCTATGTT 
CCTCCCTGCA 
TGATGAGGGG 
CTTCTTTOOG 
TGCAGGAAGT 



GGGCAGCCTG 
ACTGAAGAAG 
GTTGCCATTC 
AAGTTACAGT 
TQTTTTTCTO 
TCCCTOGGGA 
CAGCATATAA 
ATGCTCAGAA 
CATTTGGGGA 
CAAAAAAATT 
TTCAGCAGAG 
AACGTTA7TT 
TAGAAATTTG 
CTCATCCTAA 
AATGCAGGCT 
GACTGTCACC 
GCOCAAAGTC 
CATGCAGCCT 
GTGCATTTGG 
CAGCAAGCTC 
GCACCTCTAG 
CTCTGAGACA 
AAATCTTTTA 
TATTTATATG 
GAAAGATGCA 
CAGACTAACC 
AGTTCTTGAA 
TGTGCTAGTT 
GGAATAAGGG 
TAAAATGGAA 
CAGCTGAAGA 
GGGGCTAAAG 



51 
I 

GGCCGCTGTG 
TCACCAATGA 
GCGCCTGGCA 
CCAAGGCTAA 
TCATTGGTCG 
ACTGCACAGC 
GTATTGACAA 
AAAGCTGTGA 
AACTTGTGTA 
TGGTGGTGGC 
CGCTGCCCGT 
ACCCCCACCA 
AGGCGGAGCA 
ACCAGAGGCC 
TGAACCAAGC 
CCCAGGCAGC 
GCCCCCAGGA 
T ATAAG TCCC 
TAACAATTTG 
TTGGGATATT 
GCGTCTCAGT 
CCCGAGATCA 
AACAGTATTG 
GTGCAGGAGA 
TTTGGGTTAG 
CCATTTGAGC 
TCAGTGGCCA 
TGGTTTTGTO 
CCCAAGAATG 
AATAGGCAGG 
GOCAAGACCC 
CTCOCAGCTG 
TQACCTGGCT 
CAACACTGGC 
ACTTGAGGAC 
TCCTGGCTCC 
TTAGAGTTAG 
CATGGGCAAG 
GAAATGCATT 
TGTATAGGAA 
AAAGGAGATC 
TGTGTGCCAG 
GGAAGCAGAA 
rETCTTTTTT 
GTAAAACGTT 
CCAGGTAGAG 
AATGTTCAGT 
TGGCATTCAG 
TGTTACAGAA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
AAGCTAGCCA CTGGTATTTT GTTTTGTTTA AAAAAAAAAA GAAAGAAAGA AAGAAAGAAA 3000

AACGGAAAGG AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAGTTATTTT 3060 

CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAGAAT GTGTTAGAAC TCTGGCTGAA 3120 

CATTTCATCT CCTGTGAGTC AQAA6GGCTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180 

TT TCTG GTGC TCTGGAAGTT GTTTAGAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3240 

QAGTTAATCT CA C TC GCTT T TCTGCTTCCA GGCATCTTAG GAAAAACAAA TOQTTTTAGT 3300 

AG ATAAGGGA TGCCTACTAA TGCTTTTTTA AAACAAACAG GGACATTTTT ATTATAGATT 3360 

TGATTT TTTT AATGAATGTT TTTAAAAATA TATAAATAOG ACACCAAAGC GGCAGGGTTT 3420 

TTTTTG GCGG GAGGGGGTTT GTTTTCCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480 

TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA CGCCAACCAG 3540 

AAAATAGTCT CATCTCTTTT TTTCTCAAAT GAGATCCGTG TTTTATTTTA GCATTAAATT 3600 

AGTTACACTQ TGATQACTGO CCTA1TACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660 

7TTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTG 3720 

GAAAGGTTGT GTGTCGTTGC TTTTTGTOTT TTGGTTAGGC TTGGTTTTGT TTTTTAATTT 3780 

TTATACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTGCAAAWG GWMCTAMMU4 3840 

AAHMAAAAAC AWYWTTGGGG GGGCTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900 

T6G6GCGG06 GGGCCCACGT AGGTACGGOG ACCACGCGGG CCCAAA0G6G ACCCCAGAAG 3960 

GAAACCCTGG CCAAGAAAAA GGTGGCGAGA ATTCTCCACA CCAGAAAAAA ACGCGOCGGG 4020 

GGAAACCGCA GAGTGTTGCG TAAACCACAC CCGAAQAGAG AACTCAGAAG CACACAAGCQ 4080 
GQACTCAACC AGGAGGACCC AA66QAACCC GATAGAGTAC 6 



SEQ ID NO:171 PCQ7 Protein sequence

Protein Accession #: none found



1 11 21 31 41 51

I I I I I I 

MWLLGPLCLL LSSAAESQLL PGNNFTNECN IPGNFMCSNG RCIPGAWQCD QLPDCFDKSD 60 

EKECPKAKSK CGPTPFPCAS GIHCIIGRFR CNGFEDCPDO SDEBNCTANP LLCSTARYHC 120 

KNGLCXDKS? ICDGQNNCQD NSDBESCESS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180 

VXFVLWALL ALVLHHQRKR KNLKTLPVHR LQHPVLLSRL WLDHPHHCN VTYNVNNGIQ 240 

YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLNQADL PPYRSRSGSA 300 
NSASSQAASS LLSVEDTSHS PGQPGPQEGT ABPRDSEPSQ GTEEV 



SEQ ID NO:172 PEL5 DNA SEQUENCE
Nucleic Acid Accession #: NM_005658.1

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60 

CTTTGAACTC AGGGTCACCA CCAGCTATTG GAOCTTACTA TGAAAACCAT GGATACCAAC 120 

CGGAAAACCC CTATCCCGCA CAGCCCACTG TGGTCCCCAC TGTCTACGAG GTGCATCCGG 180 

CTCAGTACTA CCCGTCCCCC GTGCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240 

ACCCCGTCGT CTGCACGCAG CCCAAATCCC CATCCGGGAC AGTGTGCACC TCAAAGACTA 300 

AGAAAGCACT GTGCATCACC TTGACCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360 

CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420 

CCTCAGGTAC CTGCATCAAC CCCTCTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480 

GGGAGGACGA GAATCGGTGT GTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATGTACT 540 

CATCTCAGAG GAAGTOCTGG CACCCTGTGT GCCAAGACGA CTGGAACGAG AACTACGGGC 600 

GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660 

ATGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCOGGCAAT GTCGATATCT 720 

ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780 

TAGCCTG C GG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840 

CGCTCCCGGG GGCCTGGCCC TGGCAGGTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900 

GAGGCTCCAT CATCACCCCC GAGTGGATCG TGACAGCCGC CCACTGCGTG GAAAAACCTC 960 

TTAACAATCC ATGGCATTGG ACGGCATTTG CGGGGATTTT GAGAGAATCT TTCATGTTCT 1020 

ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACC2A 1080 

AGAACAATGA CATTGCGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140 

AACCAGTGTG TCTGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCT6GATTT 1200 

CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260 

TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320 

CAGCCATGAT CTGTGCCGGC TTCCTGCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380 

GAGGGCCTCT GGTCACTTCG AACAACAAXA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500 

ACTGGATTTA TCGACAAATG AAGGCAAACG GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560 

CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620 

ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TGGCACTCTC 1680 

TGCCATACTG TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740 

CCGC AAGGGG TGATGGCCGG CTG G TTGT GQ GCACTGGCGG TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGGCT GOOCCCATTG AGATCTTCCT GCTGAGTCCT TTOCAGGGGC CAATTTTGOA 1860 

TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGATCAAAAA GGAGAGACAT 1920 

GGAAAGGGAG A CAGC CAGGT GGCACCTGCA GOGCCTCOCC TCTGGGGCCA CTTGGTAGTG 1980 

TCCOCAGCC7 ACTTCACAAG GGGA9TTTGC TGATGGGTTC TTAGAGCCTT AGCAGCCCTG 2040 

GATGGTGGCC AGAAATAAAG GGACCAGCOC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100 

AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAT ATAGACAGTG COCTTGGTGC 2160 
GAGCGAAGCA ATTGAAAAGG AACTTOCCCT GAGCACTCCT GGTGCAGGTC TCCAOCTGCA 2220

CATTGGGTGG 6GCTCCTG6G AGGGAGACTC A6CCTTCCTC CTCATCCTCC CTGACCCTCC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTC C CTSGC AGGGCGCCAA GTTTGGCACC 2340 

ATCTCGGCCT CTTGAGGCCT GATAGTCATT OGAAATTGAG GTCCATGGGG GAAATGAAGG 2400 

ATGCTCAGTT TAAGGTACAC TGTTTCCATG TTATGTTTCT ACACATTGAT GGTGGTGACC 2460 
CTGAGTTCAA AGCCATCTT 



SEQ ID NO:173 PEL5 Protein sequence
Protein Accession #: NP_005647.1

1 11 21 31 41 51

I I I I I I 

HALNSGSPPA IGPYYENHGY QPENPYPAQP TWPTVYEVH PAQYYPSPVP QYAFKVLTQA 60 

SNPWCTQPK SPSGTVCTSK TKKALCITLT LGTFLVGAAL AAGLLWKFMG SKCSNSQIBC 120 

DSSGTCINPS NWCDGVSHCP GGEDENRCVR LYGFNFILQM YSSQRKSWHP VCQDEWNENY 180 

GRAACRDMGY KNNPYSSQGI VDDSGSTSFM KLNTSAGNVD XYKKLYHSDA CSSKAWSLR 240 

CLACGVNLNS SRQSRIVGGE SALPGAWPWQ VSLHVQNVHV CGGSIITPEW IVTAAHCVEK 300 

PLNNPWHWTA PAGILRQSFM FYOAGYQVQK VISHPNYDSK TKNNDIALMK LQKPLTFNDL 360 

VKFVCLPNPG MMLQPEQLCW ISGWGATEEK GKTSEVLNAA KVLLIETQRC NSRYVYDNLI 420 

TPAMICAGFL QGNVDSCQGD SGGPLVTSNM NIWWLIGOTS WGSGCAKAYR PGVYGNVMVP 480 
TDWIYRQMKA NG 
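
A coding range and its protein listing can be cross-checked by simple arithmetic. The sketch below is illustrative only and not part of the original disclosure. The function name `expected_residues` is hypothetical, and it assumes that a stated coding range runs from the first base of the start codon through the last base of the stop codon, which the note about start and stop codons suggests but does not state outright.

```python
def expected_residues(cds_start: int, cds_end: int) -> int:
    """Number of amino acids encoded by a 1-based inclusive coding range.

    Assumes the range includes the stop codon, which encodes no residue.
    """
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1


# Coding range 57-1535 given for the preceding DNA sequence.
print(expected_residues(57, 1535))  # prints 492
```

Under that assumption the range 57-1535 spans 1479 bases, or 493 codons, giving 492 residues. This agrees with the protein listing above, whose last full line is numbered 480 and is followed by 12 further residues.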



SEQ ID NO:174 PBJ4 DNA SEQUENCE

Nucleic Acid Accession #: AK94767

Coding sequence: 130-1086 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

CAGAGAGGCT GTATTTCAGT GGAGCCTGCC AGACCTCTTC TGGAGGAAGA CTGGACAAAG 60 

GGGGTCACAC ATTCCTTCCA TACGGTTGAG OCTCTACCTG CCTGGTGCTG GTCACAGTTC 120 
AGCTTCTT CA TGA TGGTQGA TCCCAATGGC AATGAATOCA GTGCTACATA CTTCATCCTA 180 

AT AGGCC TCC CTGGTTTAGA AGAGGCTCAG TTCTGGTTGG CCTTCCCATT GTGCTCCCTC 240 

TACCTTATTG CTGTGCTAGG TAACTTGACA ATCATCTACA TTGTGCGGAC TGAGCACAGC 300 

CTGCATGAGC CCATGTATAT ATTTCTTTGC ATGCTTTCAG GCATTGACAT CCTCATCTCC 360 

ACCTCATCCA TGCCCAAAAT GCTGGCCATC TTCTGGTTCA ATTCCACTAC CATCCAGTTT 420 

GATGCTTGTC TGCTACAGAT GTTTGCGATC CACTCCTTAT CTGGCATGGA ATCCACAGTG 480 

CTGCTGGCCA TGGCTTTTGA CCGCTATGTG GGCATCTGTC ACCCACTGCG CCATGCCACA 540 

GTACTTACGT TGCCTCGTGT CACCAAAATT GGTGTGGCTG CTGTGGTGCG GGGGGCTGCA 600 

CTGATGGCAC OOCTTO CT OT CTTCATCAAO CAGCTGCCCT TCTGCCGCTC CAATATCCTT 660 

TCOCATTCCT ACTGCCTACA CCAAGATGTC ATGAAGCTGG CCTGTGATGA TATCCGGGTC 720 

AATGTCGTCT ATGGCCTTAT CGTCATCATC TCCGCCATTG GCCTGGACTC ACTTCTCATC 780 

TCCTTCTCAT ATCTGCTTAT TCTTAAGACT GTGTTGGGCT TGACACGTGA AGCCCAGGGC 840 

AAGGCATTTG GCACTTGCGT CTCTCATGTG TGTGCTGTGT TCATATTCTA TGTACCTTTC 900 

ATTGGATTGT CCATGGTGCA TOGCTTTAGC AAGCGGCGTG ACTCTCCACT GCCCGTCATC 960 

TTGGCCAATA TCTATCTGCT GGTTCCTCCT GTGCTCAACC CAATTGTCTA TGGAGTGAAG 1020 

ACAAAGGAGA TTCGACAGCG CATCCTTCGA CTTTTCCATG TGGCCACACA CGCTTCAGAG 1080 

CCCTAGGTGT CAGTGATCAA ACTTCTTTTC CATTCAGAGT OCTCTGATTC AGATTTTAAT 1140 

GTTAACATTT TGGAAGACAG TATTCAGAAA AAAAATTTCC TTAATAAAAA TACAACTCAG 1200 

ATCCTT CAAA TATGAAACTG GTTGGGGAAT CTCCATTTTT TCAATATTAT . TOlU'ltartfl 1 1260 

OrriWATGC TACATATAAT TATTAA7ACC CTGACTAGGT TGTGGTTGGA GGGTTATTAC 1320 

TTTTCATTTT ACCATGCAGT CCAAATCTAA ACTGCTTCTA CTGATGGTTT ACAGCATTCT 1380 

GAGATAAGAA TGGTACATCT AGAGAACATT TGCCAAAGGC CTAAGCACAG CAAAGGAAAA 1440 

TAAACACAGA ATATAATAAA ATGAGAIAAT CTAGCTTAAA ACTATAACTT CCTCTTCAGA 1500 

ACTOCCAAGC ACATTGGATC TCAGAAAAAT ACTGTCTTCA AAATGACTTC TACAGAGAAG 1560 

AAATAATTTT TOCTCTGGAC ACTAGCACTT AAGGGGAAGA TTGGAAGTAA AGCCTTGAAA 1620 

AGAGTACATT TACCTACGTT AATGAAAGTT GACACACTGT TCTGAGAGTT TTCACAGCAT 1680 

ATGGACCCTG TTTTTCCTAT TTAATTTTCT TATCAAOCCT TTAATTAGGC AAAGATATTA 1740 
TTAGTACCCT CATTGTAGCC ATGGGAAAAT TGATGTTCAO TGGGQATCAG TGAATTAAAT 1800

GGGGTCATAC AAGTATAAAA ATTAAAAAAA AAAGACTICA TGCCCAATCT CATATGATGT 1860 

GGAAGAACTG TTAAAGAGAC CAACAGGGTA GTGGGTTAGA GATTTCCAGA GTCTTACATT 1920 

TTCTARAGGA GGTATTTAAT TTCTTCTCAC TCATCCAGTG TTGTATTTAG GAATTTCCTG 1980 

GCAACAQAAC TCATGGCTTT AATCCCACTA GCTATTGCTT ATTGTCCTGG TCCAATTGCC 2040 

AATTACCTGT GTCTTGGAAG AAGTGATTTC TAGGTTCACC ATTATGGAAG ATTCTTATTC 2100 

AGAAAGTCTG CATAGGGCTT ATAGCAAGTT ATTTATTTTT AAAAGTTCCA TAGGTGTTTC 2160 

TGATAGGCAG TGAGGTTAGG GAGOCACCAG TTATGATGGG AAGTATGGAA TGGCAGGTGT 2220 

TGAAGAXAAC ATTGGOCTTT TGAGTGTGAC TCGTAGCTGG AAAGTGAGGG AATCTTCAGG 2280 

ACCATGCTTT ATTTGGGGCT TTGTGCAGTA TGGAACAGGG ACTTTGAGAC OGGGAAAGCA 2340 

ATCTGACTTA GGCATGGGAA TCAGGCATTT TTGCTTCTGA GGGGCTATTA CCAAGGGTTA 2400 

ATAGGTTTCA TCTTCAACAG GATATGACAA CAGTCTTAAC CAAGAAACTC AAATTACATA 2460 

TACTAAAACA TGTGATCATA TATGTGGTAA GTTTCATTTT CTTTTTCAAT CCTCAGGTIC 2520 

CCTGASATGG ATTCCTAOMA CATGCTTTCA TCCCCTTTTG TAATGGATAT CATATTTCGA 2580 

AATGCCTATT TAATOCTTGT ATITGCTGCT GGACTGTAAG CCCATGAGGG CACTGTTTAT 2640 

TATTGAATGT CATCTCTGTT CATCATTGAC TGCTCTTTGC TCATCATTGA ATOCCCCAGC 2700 

AAAGTGCCTA GAACATAATA GTGCTTATGC TT6ACAOCGG TTATTTTTCA TCAAACCTGA 2760 

TTCCTTCTGT GCTGAACACA TAGCCAGGCA ATTTTCCAGC CTTCTTTGAG TTGGGTATTA 2820 

TTAAATTTTA GCCATTACTT CCAA TGTGAO TGGAAGTGAC ATGTGCAATT TTTATACCTG 2880 

GCTCATAAAA CCCTCCCATG TGCAGOCTTT CATGTTGACA TTAAATGTGA CPTGGGAAGC 2940 
TATGTCTTAC ACAGAGTTAA TTAACCNGAA AGGCCTGGNA A1TTTTTGNN AANNAAACTG 3000
TGGCCNNGAG GCCCNCAACC CTTTTTNNNA ATTTGGCAAN NTCCCACTTT GTANTTTGGT 3060 
AAGGAGGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 



SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE
Protein Accession #: not available, cloned at Eos



1 11 21 31 41 51 

I I I I I I

MVDPNGNESS ATYFILIGLP GLEEAQFWLA FPLCSLYLIA VUGNLTIIYI VRTSHSLHE? 60 
HYTFLCHLSO IDILISTSSM PKMLAIFWFN STTIQFDACL LQMFAIHSLS GMESTVLLAM 120 
AFDRYVAICH PLRRATVLTL PRVTKIGYAA WRGAALMAP LPVFIKQLPF CRSNILSHSY 180 
CLHQDVMKLA CDDIRVNWY GLIVIISAIG LDSLLISFSY LLILKTVLGL TRBAQAKAFG 240 
TCVSHVCAVF IPYVPFIGLS MVHRFSKRRD SPLFVILANI YLLVPFVLNP IVYGVKTKEI 300 
RQRILRLFHV ATHA5BP 

SEQ ID NO:176 PM72 DNA SEQUENCE
Nucleic Acid Accession #: NM_004624.1

Coding sequence: 67-1544 (underlined sequences correspond to start and stop codons)



TCGGAGCCTG CGGAGGGTGG TGGTGGTGGT GGTGGTGGCC CTCGCCCGCC TCACTCATGC 60 

CTCCTCCTCC TCTGCTCTCG CTCAGGCGCC TCGGTGGCGG TTGGTCGGCG GTTACGCGGC 120 

TGGTGGTCGC GGCGGCCGGG GCTCGCTCTC GGGGAGGCCG GGGCGGATCT CGCGGCGCAG 180 

GCGGCGGCGO CCGAGGTGGG GTCGCGCGGC GGAGGCGGCT CGAGCTTCGT GCTGCGCGCT 240 

CGCTCTTGGQ CTCCTCGCTG CAGGAGGAGT GTGACTATGT GCAGATGATC GAGGTGCAGC 300 

ACAAGCAGTG CCTGGAGGAG GCCCAGCTGG AGAATGAGAC AATAGGCTGC AGCAAGATGT 360 

GGGACAACCT CACCTGCTGG CCAGCCACCC CTCGGGGCCA GGTAGTTGTC TTGGCCTGTC 420 

CCCTCATCTT CAAGCTCTTC TCCTCCATTC AAGGCCGCAA TGTAAGCCGC AGCTGCACCG 480 

ACGAAGGCTG GACGCACCTG GAGCCTGGOC CGTACCCCAT TGCCTGTGGT TTGGATGACA 540 

AGGCAGCGAG TTTGGATGAG CAGCAGACCA TGTTCTACGG TTCTGTGAAG ACCGGCTACA 600 

CCATTGGCTA CGGCCTGTCC CTCGCCACCC TTCTGGTCGC CACAGCTATC CTGAGCCTGT 660 

TCAGGAAGCT CCACTGCACG CGGAACTACA TCCACATGCA CCTCTTCATA TCCTTCATCC 720 

TGAGGGCTGC CGCTGTCTTC ATCAAAGACT TGGCCCTCTT CGACAGCGGO 6AGTCGGACC 780 

AGTGCTCCGA GGGCTCGGTG GGCTGTAAGG CAGCCATGGT CTTTTTCCAA TATTGTGTCA 840 

-TGGCTAACTT CTTCTGGCTG CTGGTGGAGG GCCTCTACCT GTACACCCT6 CTTGCCGTCT 900 

CCTTCTTCTC TGAGCGGAAQ TACTTCTGGG GGTACATACT CATCGGCTGG GGGGTACCCA 960 

GCACATTCAC CATGGTGTGG ACCATCGCCA GGATCCATTT TGAGGATTAT GGTCTGCTCA 1020 

GGTGCTGGGA CACCATCAAC TCCTCACTGT GGTGGATCAT AAAGGGCCCC ATCCTCACCT 1080 

CCATCTTGGT AAACTTCATC CTGTTTATTT GCATCATCCG AATCCTGCTT CAGAAACTGC 1140 

GGCCCCCAGA TATCAGGAAG AGTGACAGCA GTCCATACTC AAGGCTAGCC AGGTCCACAC 1200 

TCCTGCTGAT CCCCCTGTTT GGAGTACACT ACATCATGTT CGCCTTCTTT CCGGACAATT 1260 

TTAAGCCTGA AGTGAAGATG GTCTTTGAGC TCGTCGTGGG GTCTTTCCAG GGTTTTGTGG 1320 

TGGCTATCCT CTACTGCTTC CTCAATGGTG AGGTGCAGGC GGA6CTGAGG CGGAAGTGGC 1380 

GGCGCTGGCA CCTGCAGGGC GTCCTGGGCT GGAACCCCAA ATACCGGCAC CCGTCGGGAG 1440 

GCAGCAACGG CGCCACGTGC AGCACGCAGG TTTCCATGCT GACCCGCGTC AGCCCAGGTG 1500 

CCCGCCGCTC CTCCAGCTTC CAAGCCGAAG TCTCCCTGGT CTGACCACCA GGATCCCAGC 1560 

CCAAGCGGCC CCTCCCGCCC CTTCCCACTC GCAGCAGACG COGGGGACAG AGGCCTGCCC 1620 

GGGCGCGCCA GCCCCGGCCC TGGGCTCGGA GGCTGCCCCC GGCCCCCTGG TCTCTGGTCC 1680 

GGACACTCCT AGAGAACGCA GCCCTAGAGC CTGCCTGGAG CGTTTCTAGC AAGTGAGAGA 1740 

GATGGGAGCT CCTCTCCTGG AGGATGCAGG TGGAACTCAG TCATTAGACT CCTCCTCCAA 1800 

AGGCCCCCTA CGCCAATCAA GGGCAAAAAG TCTACATACT TTCATCCTGA CTCTGCCCCC 1860 

TGCTGGCTCT TCTGCCCAAT TGGAGGAAAG CAACCGGTGG ATCCTCAAAC AACACTGGTG 1920 

TGACCTGAGG GCAGAAAGGT TCTGCCCGGG AAGGTCACCA GCACCAACAC CACGGTAGTG 1980 

CCTGAAATTT CACCATTGCT GTCAAGTTCC TTTGGGTTAA GCATTACCAC TCAGGCATTT 2040 

GACTGAAGAT GCAGCTCACT ACCCTATTCT CTCTTTACGC TTAGTTATCA GCTTTTTAAA 2100 

GTGGGTTATT CTGGAGTTTT TGTTTGGAGA GCACACCTAT CTTAGTGGTT CCCCACCGAA 2160 

GTGGACTGGC CCCTGGGTCA GTCTGGTGGG AGGACGGTGC AACCCAAGGA CTGAGGGACT 2220 

CTGAAGCCTC TGGGAAATGA GAAGGCAGCC ACCAGCGAAT GCTAGGTCTC GGACTAAGCC 2280 

TACCTGCTCT CCAAGTCTCA GTGGCTTCAT CTGTCAAGTG GGACTCTGTC ACACCAGCCA 2340 

TTCTTATCTC TCTGTGCTGT GGAAGCAACA GGAATCAAGA GACTGCCCTC CTTGTCCACC 2400 

CACCTATGTG CCAACTGTTG TAACTAGGCT CAGAGATGTG CACCCATGGG CTCTGACAGA 2460 

AAGCAGATCC TCACCCTGCT ACACATACAG GATTTGAACT CAGATCTGTC TGATAGGAAT 2520 

GTGAAAGCAC GGACTCTTAC TGCTAACTTT TGTGTATCGT AACCAGCCAG ATCCTCTTGG 2580 

TTATTTGTTT ACCACTTGTA TTATTAATGC CATTATCCCT GAATTCCCCT TGCCACCCCA 2640 

CCCTCCCTGG AGTGTGGCTG AGGAGGCCTC CATCTCATGT ATCATCTGGA TAGGAGCCTG 2700 

CTGGTCACAG CCTCCTCTGT CTGCCCTTCA CCCCAGTGGC CACTCAGCTT CCTACCCACA 2760 

CCTCTGCCAG AAGATCCCCT CAGGACTGCA ACAGGCTTGT GCAACAATAA ATGTTGGCTT 2820 
GGAAAAAAAA AAAA 
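
The grouped layout used in these listings (blocks of ten residues followed by a running position count) can be flattened and sliced programmatically. The following is a minimal Python sketch and is not part of the original listing. It assumes that the "Coding sequence" coordinates are 1-based and inclusive; the function names and the toy rows are hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def parse_listing_block(lines):
    """Flatten listing rows into one string, dropping spaces and the trailing position count."""
    parts = []
    for line in lines:
        line = re.sub(r"\s+\d+\s*$", "", line)   # strip a trailing counter such as " 60"
        parts.append(re.sub(r"\s+", "", line))    # remove the spaces between groups of ten
    return "".join(parts).upper()

def extract_cds(sequence, start, end):
    """Slice a coding region given 1-based inclusive coordinates (assumed convention)."""
    return sequence[start - 1:end]

def looks_like_orf(cds):
    """Check that the length is a multiple of three, starts with ATG, and ends with a stop codon."""
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS

# Hypothetical toy rows in the same layout (not taken from the listing).
toy_rows = ["GGATGAAACC 10", "CTGAGG"]
toy_seq = parse_listing_block(toy_rows)
toy_cds = extract_cds(toy_seq, 3, 14)
```

A check such as `looks_like_orf` only makes sense on rows that are free of extraction errors; stray digits or letters inside a row would need to be resolved against the cited accession first.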

SEQ ID NO:177 PM72 PROTEIN SEQUENCE 

Protein Accession #: JC2195 

1 11 21 31 41 51 

I I I I I I 

HPPPFLLSLR RLGGGWSAVT RLWAAAGAR SRGGRGGSRG AGGGGRGGVA RRRRLBLRAA 60 

RSLLGSSLQB BCDYVQUIEV QHKQCLEEAQ LENETIGCSK MWDNLTCWPA TPRGQVWLA 120 

CPLIFKLFSS IQGRNVSRSC TDEGWTHLEP GPYPIACGLD DKAASLDEQQ TMFYGSVKTG 180 

YTIGYGLSLA TLLVATAILS LFRKLHCTRN YTHMHT.FISP ILRAAAVFIK DLALFDSGES 240 

DQCSEGSVGC KAAHVFFQYC VMANFFWLLV EGLYLYTLLA VSFFSERKYF WGYILIGWGV 300 

PSTFTMVWri ARZHFEDYGL LRCWDTINSS LWWIIKGPIL TSILVNFILP ICIIRILLQK 360 

LRPPDIRKSD SSPYSRLARS TLLLIPLFGV HVTMFAFFPD NFKPEVKMVP ELWGSFQGP 420 

WAILYCFLN GEVQAELRRK WRKWHLQGVL GWNPKYRHPS GGSNGATCST QVSMLTRVSP 480 
GARRSSSFQA EVSLV 



SEQ ID NO:178 BFF8 DNA SEQUENCE 

Nucleic Acid Accession #: AL133619 

Coding sequence: 1-2070 (underlined sequences correspond to start and stop codons) 






1 11 21 31 41 51 
I I I I I I 

ATGAGCGGTQ CGGGGGTGGC GGCTGGGACG CGGCCCCCCA GCTCGCCGAC CCCGGGCTCT 60 
CGGCGCCGGC GCCAGCGCCC CTCTGTGGGC GTCCAGTCCT TGAGGCCGCA GAGCCCGCAG 120 
CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG AGAAAAGCCT GCAGTTCCTG 180 
CAGCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG AGATCGAGCA TCTGAAGCGG 240 
GAAAACAAGG GTGAGCCGGC OCGGGGCCCT AGGCCGGCCC TGCCTCCCCA GGCACACTCA 300 
ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT CCAGCACACG CCTGGGCTCA 360 
GGGGGAACAC AGGACGGGGA GCCCCTCCAG ACTGTCCTTG CCCACCTGGC TGCACTGGCC 420 
CCTGTATGCC AACCCAGTGG QTACAGGTTC TGOGGGACCT GGACAGATGC CGCTACCTCT 480 
AGCC6TGGCT GGACGATGTT ATGCAGCCAA GCACAGCACG TGCTGCTCTC GGOAAGCCCA 540 
GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT GCTCCCCAGA CCTCCCTCCT 600 
CCAAGTAGAG CTGAAATGGG AAGGAACCCC TGGGACAGCC CCTGCCCTGC TAGATCTTTG 660 
CCTCAGATTG CTGCTGTGGC CAGGCCCAGG ATTTCCAGCC CTATGGCTCT GAGTCCTCAC 720 
ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAGG GATCCCTTCC TGCCATCTGG 780 
GCAGCAACCA TGGGGACAAA GGGAGGAAGC AGAGTCCTGT TTOCTTGCCA CTTCTCCAAG 840 
GCACTTCCCC ATCCTGACAG CGGCCCCCAC CCAGCCCAGG ATCCTGGGCT GTGGTCTCAA 900 
GCTCACTTCC CATTATCTTT GGGGCTGGGG CTGACATCAG GAGGACATCT [missing] 960 
TGGAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA GGGCTCTCCC TTCOCAGGGA 1020 
GACATGGAGA AGGGGGTTGA GGGAGG6CCC TTCCCTAGCC GCTGTGGCAA CTCCAGTGAG 1080 
CTGTTCTGGG CAAAGTGTGG CCCAAGTCGG CAGCCCCAGC CCTGCAGTGC TGGGGACGCT 1140 
GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT GCTGTTCCAT GTGTCCCAAG 1200 
CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT CCAGCJGCCTC TGCTCCCTTG 1260 
GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGGTAGAGC CGGGAGGACC CAGCCCTGCC 1320 
AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG GCAAGCGTGG GCGTCTTGCG 1380 
GGCGGTAGCG CCGACACTGT GOGCTCTCCT GCAGACAGCC TCTCCATGTC AAGCTTCCAG 1440 
TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA AGGCCAGGCC CGAGCCCGGC 1500 
TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA AGGCGGACCT GGAAGAGGAG 1560 
CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG TACAAGGGCA GGCCAGAAAG 1620 
GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG GGAACAGCCA GCACCAGGGC 1680 
AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC CCCTTCCCCT GCGAAAGCCC 1740 
ACCACACTTA GGCAGTGCGA AGTGCTCATC CGCGAGCTGT GGAATACCAA CCTCCTGCAG 1800 
ACCCAAGAGC TGCGGCAOCT CAAGTCCCTC CTGGAAGGGA GCCAGAGGCC CCAGGCAGCC 1860 
CCGGAGGAAG CTAGCTTTCC CAGGGACCAA GAAGCCACGC ATTTCCCCAA GGTCTCCACC 1920 
AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG AGCGTGCCAT CCTGCOCGCA 1980 
CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA AGAGGCTGCA GGCAATGCAG 2040 
AAACOGCGOC TGCATCGCTC AGTGCTTXQA 



SEQ ID NO:179 BFF8 PROTEIN SEQUENCE 
Protein Accession #: 



l 
I 

MSGAGVAAGT 
QQQHSEHLAK 
GGTQDGEPLQ 
GPEVTAGRQV 
MLGAQGIWTH 
AHFPLSLGLG 
LFWAKCGPSR 
GARWVCINGV 
SVKSISMSAN 
EKAEASNAGA 
TQELRHLKSL 
LKQTPKNNFA 



11 21 

I I 
RPPSSPTPGS RRRRQRPSVG 



TVLAHLAALA 
ATGCSPDLPP 
SIQGSLPAIW 
LTSGGHLTGG 
QPQPCSAGDA 
WVEPGGPSPA 
SQGKARPQPG 



PVCQPSGYRF 



AATWGTKGGS 
WSQPGNIAAG 
DRTREEAMLS 



LEGSQRPQAA 
ERQKRLQAMQ 



SPNKQDSKAD 
RQMGAGAHPP 
PEEASFPRDQ 
KRRLHRSVL 



T43457 



31 
I 

VQSLRPQSPQ 
RPALPPQAHS 
WGTWTDAATS 
WDSPCPARSL 
RVLFPCHLSK 
AVPRALPSQG 
LGTCCSKCPK 
RPGGKRGRLA 
VSQKADLEEE 
MILPLPLRKP 
EATHFPKVST 



TLPLPQHRNT 
SRGWEHLCSQ 
PQIAAVARPR 
ALPHPDSGPH 
DMEKGVEGGP 
PSCPPDGPSG 
GGSADTVRSP 
PLLHNSKLDK 
TTLRQCEVLI 
KSLSKKCLSP 



51 
I 

LDLBKSLQFL 



AQHVLLSGSP 
ISSPMALSPH 
PAQDPGLWSQ 



NHLSRASAPL 



VPGVQGQARK 
RELWNTNLLQ 
PVAERAILPA 






60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 



SEQ ID NO:180 BCR4 DNA SEQUENCE 

Nucleic Acid Accession #: NM-01231O2 

Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGCGCCTG GTAOAGATTT CTCGAAGACA 60 

CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120 

GCGGAGACGA AGGCGCAATQ OCGAGGAAGT TATCTOTAAT CTTGATCCTQ ACCTTTGCCC 180 

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAO ACCACTQAGA 240 

AAATTAGTCC GAATTGOGAA TCTGGCATTA ATCTTCACTT GGCAATTTCC ACAC6GCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCAT6 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTOC AAGACCTGGA AAA CT C T TCC 780 

CCAAAGATGT AAGCAGCTCC ACTCCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 

GAAACACAAA TGAAAATCCT CAGGAGTOTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CT A TCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATT TCTCC T GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGOCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAG AAAC C TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTOC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAA6AGC0C TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACGCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGGC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTCTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGGAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

T CTAGT TAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGOGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAAC AAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGIATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATAICAGCTT G 

SEQ ID NO:181 BCR4 PROTEIN SEQUENCE 
Protein Accession #: NP_036451 

1 11 21 31 41 51 

I I I I I I 

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW BSGINVDLAI STRQYHLQQL 60 

FYRYGENNSL SVBGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDFBN SQGKGAHRPE HASGRRNVKD 180 

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLPPKDVSS STPPSVTSKS RVSRLAGRKT 240 

NBSVSBPRKG FMYSRNTNEN PQECFNASKL LTSHGMGIQV PLNATEFNYL CPA1IKQ1DA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGPI AISIISFLSL LGVILVPLMN KVFPKPLLSP 360 

LVALAVGTLS GDAFLHLLPH SHASHHHSES HEEPAMEHKR GPLFSHLSSQ NIEESAYFDS 420 

TWKGLTALGG LYFMPLVEHV I/TLIRQPKDR KKKNQKKPEN DDDVEIKKQL SKYESQLST3J 480 

EEKVDTDDRT EGYL RADSQB PSHFDSQQPA VLBEEEVMIA HAHPQBVYNE YVPRGCKNKC 540 

HSHPHDTLGQ SDDL I HHHH D YHHTLHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVT 600 

MGDGLHNFSD GLAIGAAPTE GLSSGLSTSV AVFCHELPHB LGDFAVLLKA GMTVKOAVLY 660 

KALSAMLAYL GMATGIFZGH YAEWSKWIF ALTAGLPMYV ALVDMVPEML HMDASDHGCS 720 
RWGYFPLQNA GMLLGPGIML LISIP KHK1V FRINP 



SEQ ID NO:182 BCV2 DNA sequence 

Nucleic Acid Accession #: NM_001203 

Coding sequence: 274-1782 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
I I I I I I 

CGCGGGGCGC GGAGTCGGCG GGGCCTOGCG GGACGCGGOC AGTGCGGAGA CCGCGGCGCT 60 
GAGGACGCGO OAOCCGGGAG OGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
OTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAGACT TCTGCTGATT 180 
CATAACCATT TGGCTCTGAG CTATGACAAG AGAGGAAACA AAAAGTTAAA CTTACAAGOC 240 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA 300 
AATGTGGGCA CCAAGAAAGA GGATGGTGAG AGTACAGCCC CCAOCCOOOG TCCAAAGGTC 360 
TTOCOTTGTA AATGCCACCA CCATTGTOCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420 
GACGGATATT GTTICAOGAT GATAGAAGAG GATGACTCTG GGTTGOCTGT GGTCACTTCT 480 
GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT TCCTCATCAA 540 
AGAAGATCAA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACOCTACA 600 
CTGCCTOCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT 660 
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTOCTTATCA TATTATTTTG TTACTTOCGG 720 
TATAAAAGAC AAGAAACCAG AGCTOGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC 780 
ATTCCTCCTG GAGAATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA 840 
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900 
ATTGGAAAAG GTCGCTATGG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT 960 
GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020 
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080 
GGGTCCTGGA (XCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT 1140 
TATCTGAAGT CCACCACOCT AGACGCTAAA TCAATGCTGA AGTTAGOCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 1380 
ACTCGAGTTO GCACCAAACG CTATATGCCT CCAGAAGTGT TGGAGGAGAG CTTGAACAGA 1440 
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAO 1500 
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 
ATGACAGAAT GCTGGGCTCA CAATOCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA 1740 
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCT GATAGGAGAG GAAAAGTAAG 1800 
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA 1860 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGAOCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATOCGTG 1980 
TCTGTTTGTA GGCGGAGAAA OOGTTCGGTA ACTTGTTCAA GATATGATGC AT 




SEQ ID NO:183 BCV2 Protein sequence 

Protein Accession #: NP_001194 

1 11 21 31 41 51 
I I I I I I 

MLLRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMIEED 60 
DSGLPWTSG CLGLBGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 120 
GPIHHRALLI SVTVCSLLLV LULFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDLI 180 
EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240 
WFRETEIYQT VLMRHENILG F1AADEKGTG SWTQLYLTED YHENGSLYDY LKSTTLDAKS 300 
MLKLAYSSVS GLCHLHTEIF STQGKPAIAH RDUCSKNILV KKNGTCOAD LGLAVKFBD 360 
TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HFQSY1MADM YSPGULWEV ARRCVSGGIV 420 
EEYQLPYHDL VPSDPSYEDM RETVCIKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480 
RLTALRVKKT LAKMSESQDI KL 
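
As a consistency check between a DNA record and its protein record, the number of residues expected from a coding range can be computed from the coordinates alone. The following is a minimal Python sketch and is not part of the original listing. It assumes the stated range is 1-based, inclusive, and includes the stop codon; the function name is hypothetical.

```python
def expected_protein_length(cds_start, cds_end):
    """Residues encoded by a 1-based inclusive coding range that includes the stop codon."""
    n_nt = cds_end - cds_start + 1
    if n_nt <= 0 or n_nt % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return n_nt // 3 - 1  # subtract one codon for the stop

# Coordinates as stated for the BCV2 DNA record above.
bcv2_len = expected_protein_length(274, 1782)
```

For the BCV2 range 274-1782 this gives 502, which agrees with the position counts of SEQ ID NO:183 (eight rows counted to 480 plus a final row of 22 residues).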



SEQ ID NO:184 CBF9 DNA sequence 

Nucleic Acid Accession #: AC005383 

Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTG6GTGACC CGCGTAGAAG TGAAGTACTT 60 

TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120 

CCTGGCGGTA GTTCCTCCGA CCTCAGCCGO GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCO 240 

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCT5G 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG O000CTTTCC T6TTGCT6GA GGCCGTCTGT 360 

GTTTTCCTCT TTTCCAQAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TCTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480 

ATCATGTTTC TGTTAGATGG GTCTAACAGC GTOGGGAAAG GGAGCTTTGA AAGGTCCAAG 540 

GACTTTGCCA TCACAGTCTG TOACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 



CAGGAAGTGA 
CTTGCTCTGA 
CAGATCCTCA 
CAGCTGAAGG 
GAGCTGCATQ 
GAGGATGCCA 
ACGCCA6ACT 
GAGTTCGCTG 
GCACACTGTC 
AGGACCACCT 
CCAGAAGGAC 
TGTGCCCTGA 
GCGGGCACCA 
GCCGTGCTGA 
CTGGTGGCGO 
GGCATTCCCT 
CGTGGCTTCO 
CTCACTGAGT 
GAGCTCCTCC 
GGCAGCCCAA 
GAGCTGCAGG 
CTCGTCTTCA 
AGCTTTGTGA 
CTGGTGGTGT 
GCTGCGATGC 
ACCGCCCTOC 
GTCCCCAAAG 
GCCCAGAAGC 
AGTGAGGGTC 
GCCGACCTGC 
CCAGTCAACC 
GGGAGCTACC 
TGGAGCTCTT 
ATG6CTCCCG 
GGCACTGAAA 
TTCCCGOCGT 
ATGCTGCTTA 
TTGATGTGTA 
CTGCCACCTT 
CGTTCCTTTQ 
AGGCCTTTAC 
GCAGCTTTTC 
CTTGAGGGAC 
GGTCTCAGAC 
TGTGCATGGG 
ACCTTGAAGG 



1 
I 

MPPFLLLEAV 



AGGCAAGAAT 
AATACCTTCT 
TCATCGTCAC 
AAAGGGGTGT 
CACTGGCCAG 
CCAACGGCCT 
GCAGGGTCGA 
GCAATGCCOC 
CCTTCTACA6 
GCCCAGGCCC 
TGGADGGCTA 
AGCTGAGCCT 
CTCTGGACGG 
GCGAGGACTC 
TGCCTGTGQG 
TCCGTGGTGG 
GGAGCGCCAC 
CACACTCCGA 
TGCTGGGTGT 
AGCATGTGAT 
GGAAGCTGTG 
TGTTGGACAC 
GAAOCTGTGC 
ATGGCAGCCA 
TGOGGGCCAT 
TGCACATCTA 
CTOTGGTGGT 
TGAGGAACAA 
TGCGGAGGCT 
GGTACCACCA 
TCTGCAAACC 
GCTGCAAGTG 
GCTCTGTATG 
TGCAGGAGGG 
TGGTGCCTAC 
GGCCAGGACC 



AGTAAATACC 
TCCCTTGAGG 
CACACAATCA 
TAGAGCATCC 



GTTTOTGACT 
TGAATGTGAC 
CCCAGGTCTG 
TCTTC 



CAAGAGQATG 
GCACAGAGGG 
TGATGGGAAG 
CACTGTGTTT 
CGAGCCTAGA 
CTTCAGCACC 
GGCTCACCCC 
ATGCTGGAGA 
CTGGAAGAGA 
CTGTGACTCG 
CCAGTGCCTC 
GGAATGCAGO 
CTTCCTGCGO 
TCGGGCCCGA 
GGAGTACCAG 
CCCCACCCT6 
CAGGACAGGC 
GGATGAGGTT 
AGGCAGTGAG 
GGTCTACTCG 
CAGCCGGCAG 
CT CTOC CTC A 
CCTCCAGTTT 
GGTGCAGACT 
TAGOCAGGCC 
TGACAAAGTG 
GCTCACAGGC 
TGGCATCTCT 
TGCAGGTCOC 
GGACGTGCTC 
CAGCCCGTGC 
TCGGGATGGC 
TGTGAGCCAG 
CAGCAGCCGT 
CTTCTGGAAT 
ACTATTCTCA 
AGCAGCTGAT 
CACTTTCTGT 
ATAAACAAGG 
ATGCTCGCCA 
TTTGGAC6GC 
GAGACATTCT 
TCTTGCCGAC 
CAATTAACCA 
GAGGGCCACG 



GTTTTCAAAG 
TTGCCTGGAG 
TCCCAGGGGG 
GCTGTGGGGG 
GGGCAGCACG 
CTCAGCAGCT 
TGTGAGCACA 
GGATCGCGGC 
GTGTTCCTAA 
CAGCCCTGCC 
TGCCOGCTGG 
GTCGACCTCC 
GCCAAAGTCT 
GTGGGTGTGG 
GATGTG CC TG 
ACGGGCAGTG 
CAGGACOGGC 
GCGGGCCCAG 



GATCCTCAGG 
CGGCCAGGGT 



GCCTTCGGGC 
CCCTACCTAG 
ATGADCGTCC 
GGGAGAGGCO 
GTCTTGGTCG 
CGGGATTCCC 
ATTGAGTGGC 
ATGAATGAGG 
TGGGAGGGCC 
GGATGGATTC 
ACCCCTCCCA 
GTCTGTGCCC 
CTQAGGGAGG 
GTCACCCACA 
ACCTGCTGTG 
GGTCCTGAAG 
GAATGTTGTT 
GAAGGCCACG 
GGATGCATTT 
TGCCTTTTGT 
GCTTGGTTGA 
TAAAATCGTT 



GAGGGCGCAC 
GCAGAAATGC 
ATGTGGCACT 
TCAGGTTTCC 
TGCTGTTGGC 
CGGCCATCTG 
GGACGCTGGA 
GGACCCTTGC 
CCCACCCTGC 
AGAATGGAGG 
CCTTTGGAGG 
TCTTOCT CCT 
TCGTGAAGCG 
CCACATACAG 
ACCTGGTCTG 
CCTTGCGGCA 
CACGTAGAGT 
CGOGTCACGC 
CAGAGCTGGA 
ATCTGTTCAA 
GCCGGACACA 
AGAATTTTGC 
CTGACGTGAC 
TGGACACCAA 
GTGGGGTGGG 
AGAGGGGTGC 
CAGAGGATGC 
TGGGCGTGGG 
TGATCCACGT 
TGTGTGGAGA 
GCAGCTGCGT 
OCCACTGCGA 
TTGAGACGCC 
GCAACTACAQ 
CAGGTCCTTA 
AGGATGTCCC 
AACGATGTTG 
CCTTGTTGAG 
ACTTAAATTT 
GACACAGTAA 
GCCTTTCAAG 
GCATTGAGTC 
GTGTGGAAGA 



GGAGACGGAA 



GCCATCCAAG 
CAGGTG GQAG 
TGAGCAGGTG 
CTCCAGCGCC 
GATGGTCCGO 
GGTGCTGGCT 
CACCTGCTAC 
CACATGTGTT 
GGAGGCTAAC 
GGACAGCTCT 
GT TTSTCCGC 
CAGGGAGCTG 
GAOCCTCGAT 
GGCGGCAGAG 
GGTGGTTTTG 
AAGGGOGCGA 
GGAGATCACA 
CCAAATCCCT 
AGCCCTGGAC 
TCAGATGCAG 
ACAGGTCGGC 
ACOCACCCGG 
CTCAGCOGGC 
CCGGCCTGGT 
AGCCGTTCCT 
GCCTGTOCTA 
GGCAGCTTAC 
AGCCAAGCAQ 
CCTGCAGAAT 
GAACCGTGAG 
CCTGAGGCAC 
AGAAGGCCTG 
_GAATGTCTGC 
AACTGCAGCC 
TTGAAAAGTT 
GCTATGTCAT 
AOCGGCCTGA 
TGCCCAGCAG 
ATGGAAAGCA 
TGAAAGGGGG 
GACTTGGAAA 
GGGGCTGAGT 
AGCAGTGTOC 



720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 



HVFKG6RTBP 
PAVGVRPPRM 
PCEHKTLEKV 
SQPCQNGGTC 
RAKVFVKRPV 
LTGSALRQAA 
EAVRAELEEI 
SVGPENFAQH 
APYLGGVGSA 
SVLWGVGPV 
CMNEGSCVLQ 
RTPPSNYREG 



11 
I 

CVPLFSRVPP 
KHFAITVCDG 
ELALKYLLHR 
EELHALASEP 
REPAGNAPCW 
VPEGLDGYQC 
RAVLSEDSRA 
ERGFGSATRT 
TGSPKHVMVY 
QSFVRSCALQ 
OTALLEIYDX 
LSEGLRRLAG 
KGSYRCKCRO 
LGTEMVPTFW 



21 
I 

SLPLQEVHVS 
LDISPEKVRV 
GLPGGRNASV 
RGQHVLLAEQ 
RGSRRTLAVL 
LCFIAFGGEA 
RVGVATYSRE 
6QDRPRBWV 
SDPQDLPNQI 
FEVNPDVTQV 
VMTVQRGARP 
PRDSLIHVAA 



31 
I 

KETIGKXSAA 
GAFQFSSTPH 
PQILIIVTDG 
VEDA1NGLFS 



NCALKLSLEC 
LLVAVPVGEY 
LLTESHSEDE 
PELQGKLCSR 
GLWYGSQVQ 
GVPKAWVLT 
YADLRYHQDV 
EWSSCSVCVS 



HVFLTHFATC 
RVDLLFLLDS 
QDVPDLVWSL 
VAGPARHARA 
QRFGCRTQAL 
TAFGLDTKPT 
GGRGAEDAAV 
LIEWLCGEAK 
QGWILETPLR 



SEQ ID NO:185 CBF9 Protein sequence 
Protein Accession #: none found 



41 51 
I I 

SKMMWCSAAV DIMFLLDGSN 
liEFPLDSFST QQEVKARHCR 
KSQGDVALPS KQLKERGVTV 



YRTTCPGPCD 
SAGTTLDGFL 
DGIPFRGGPT 
RBLLLLGVGS 
DLVFMLDTSA 
RAAMLRAISQ 
PAQKLRKNGI 
QPVNLCKPSP 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



NVCAPGP 



SEQ ID NO:186 PAV1 DNA sequence 

Nucleic Acid Accession #: AFZ728S0 

Coding Sequence: 57-1520 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

TGCTACCCGC GCCCGGGCTT CTOGGOTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 60 
CCCGCCCCCG GCCTCCGCAG CTCGGCATGG GCGOGGGGGT GCTCGTCCTO GGCGOCTOCG 120 
AGCCCGGTAA CCTGTCGTCG GCCGCACCGC TCCCCGACGG OSOGGCCAOC GCGGCGCGGC 180 
TGCTQGTGCC 
CGCTGTCTCA 
TCGTGGCGGG 
TCACCAACCT 
TGCCGTTCGG 
AGCTQTGGAC 
TTGCCCTGGA 
GCGCGCG66C 
TOCCCATCCT 
ACCCCAAGTG 
OCTTCTACOT 
AGAAGCAGGT 
OQCCCTOQCC 
CCQCCGCCGC 
CGCGCCTCGT 
TCTTCACGCT 
AGCTGGTGCC 
TCAACCCCAT 
GCT6C0CGC6 
CGQGCTGTCT 
ACGACGATOT 
ACGGCGGGGC 
CCTCGOAATC 
GGGAACGAGG 
OCTCGTCTGA 
TTTGGGAAGG 



CGCQTCGCCG 
GCAGTGGACA 
CAATGTGCTG 
CTTCATCATG 
GGCCACCATC 
CTCAGTGGAC 
CCGCTACCTC 
GCG6G6CCTC 
CATGCACTGG 
CTGCGACTTC 
GCCCCTGTGC 
GAAGAAGATC 
CTCGCCCTCG 
CGCCGCCACC 
GGCCCTACGC 
CTGCTGGCTG 
CGACCGCCTC 
CATCTACTGC 
CAGGGCTGCC 
GGCCCGGCCC 
CGTCGGGGCC 
GGCGGGGGAC 
CAAGGT GTAQ 
AGATCTGTGT 
ATCATCCGAG 
GATGGGAGAG 



CCCGCCTCOT 
GCGGGCATGG 
GTGATCGTGG 
TCCCTGGCCA 
GTGGTGTGGG 
GTGCTGTGCG 
GCCATCACCT 
GTGTGCACCG 

GTCACCAACC 
ATCATGGCCT 
GACAGCTGCG 

GCCCCGCTGG 
GAGCAGAAGG 
CCCTTCTTCC 
TTCGTCTTCT 
CGCAGCCCCG 
CGCCGGCGCC 
GGACCCCCGC 
ACGCCGCCCG 
AGCGACTCGA 
GGCCCGGCGC 
TTACTTAAGA 
GCAAAGAGAA 
TGGCTTGCTG 



TGCTGCCTCC 
QTCTGCTGAT 
CCATCGCCAA 
GCGCCGACCT 
GCCGCTGGGA 
TGACGGCCAG 
CGCCCTTCCG 
TGTGGGCCAT 
AGAGCGACGA 
GGGCCTACGC 
TCGTGTACCT 
AGCGCCGTTT 

CCAACGGGCG 
CGCTCAAGAC 
TGGCCAACGT 
TCAACTGGCT 
ACTTCCGCAA 
ACGCGACCCA 
CATCGCCCGG 
CGCGCCTGCT 
GCCTGGACGA 
GGGGCGCGGA 
CCGATAGCAG 
AAGCCACGGA 
ATGTTCCTTG 



CGCCAGCGAA 
GGCGCTCATC 
GACGCCGCGG 
GGTCATGGGG 
GTACGGCTCC 
CATCGAGACC 
CTACCAGAGC 
CTCGGCCCTG 
GGCGCGCCGC 
CATCGCCTCG 
GCGGGTGTTC 
CCTCGGCGGC 
GOCGCCCGGA 
TGCGGGTAAG 
GCTGGGCATC 
GGTGAAGGCC 
GGGCTACGCC 
GGCCTTCCAG 
CGGAGACCGG 
GGCCGCCTCG 
GGAGCCCTGG 
GCCGTGCCGC 
CTCCGGGCAC 
GTGAACTCGA 
CCGTTGCACA 
TTG 



AGCCCCGAGC 
GTGCTGCTCA 
CTGCAGACGC 

TTCTTCTGCG 
CTGTQTQTCA 
CTGCTGACGC 
GTGTCCTTCC 
TGCTACAACG 
TCCGTAGTCT 
CGCGAGGCCC 
CGAGCGCGGC 
CCCCCGOGCC 
CGGCGGCCCT 
ATCATGGGCG 
TTCCACCGCG 
AACTCGGCCT 
GGACTGCTCT 
CCGCGCGCCT 
GACGACGACG 
GCCGGCTGCA 
CCCGGCTTCG 
GGCTTCCCAG 
AGCCCACAAT 
AAAAGGAAAG 



240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 



SEQ ID NO:187 PAV1 Protein sequence 

Protein Accession #: AA011176 



1 
I 

KGAGVLVLGA 
MGLLHALIVL 
WGRWEYGSFF 
TVWAISALVS 
AFVYLRVFRE 
LANGRAGKRR 
FFNWLGYANS 
PPSPGAASDD 



11 

I 

SEPGNLSSAA 
LIVAGNVLVI 
CELWTSVDVL 
FLPILMHWWR 
AQKQVKKIDS 
PSRLVALRBQ 
AFNPIIYCRS 
DDDDWGATP 



21 
I 

PLPDGAATAA 
VAXAKTPRLQ 
CVTASIBTLC 
AESDEARRCY 
CERRFLGGPA 
KALKTLGIIM 
PDFRKAPQGL 
PARLLEPWAG 



31 
I 

RLLVPASPPA 
TLTNLFIMSL 
VIALDRYLAI 
NDPKCCDPVT 
RPPSPSPSPV 
GVPTLCWLPP 
LCCARRAARR 
CNGGAAADSD 



41 
I 

SLLPPASESP 
ASADLVKGLL 
TSPPRYQSLL 
NRAYAIASSV 
PAPAPPPGPP 
FLANWKAFH 



SSLDEPCRPG 



51 
I 

EPLSQQWTAG 
WPPGATIW 
TRARARGLVC 
VSPYVPLCIM 
RPAAAAATAP 
RELVFORIiFV 
A5GCLARPGP 
PASESKV 



60 
120 
180 
240 
300 
360 
420 



SEQ ID NO:188 BC02 DNA sequence 

Nucleic Acid Accession #: AJ400877 

Coding sequence: 81-3080 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
I I I I I I 

GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CXXjCACTCCG CCGCCTCTGC 60 
(XGCAACCGC TGAGCCATCC AlfiGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120 
CGGTGCTGCT GCTOCTOCTG CTGCTGCCGC CACTGC1GCT GCTGGOGGGO GOCGTCCCOC 180 
CGGGTCGGGG CCGTGOCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGOC CAAGGGCTAG 240 
ATQ ACTGCCA TGCCGACGCC CTGTGTCAG A ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 
AGCCTGGCTA CCAAGGGG A A GGCAGGCAGT GTG AGGACAT CGATG AATGT GGAAATGAGC 360 
TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420 
TTOATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480 
AGAACAATGG CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540 
GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATICACCGC TCGGAAGAGG 600 
GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGGAG GCXXXAAGGG 660 
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720 
TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AGATGGGAGG AGCTGCCTTG 840 
AGCGAGAGGA CACTGTCCTG GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 
ATAAACGGGT GAAAGGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 
ACOGCACCTG TAAGGATACT TCG ACAGGTG TOCACTGCAG TTGTCCTGTT GGATTCACTC 1020 
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080 
GTGATCATTT CTGCAAAAAC ATCGTGGGCA Gill JU ACTG CGGCTGCAAG AAAGGATTTA 1140 
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200 
GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260 
CCCTGTATGG CITCACCCAC TGTGGAGACA CCAATG AGTG CAGCATCAAC AACGGAGGCT 1320 
GTCAGCAGOT CTGTGTG AAC ACAGTGGGCA GCTATGAATG OCAGTGGCAC OCTGGGTACA 1380 
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440 
CACCCCGTGT GTOCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500 
GTCACTCTCG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560 
AGCTAAATGA AGGCAAGTGT A G 'l i 1 G AAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGG AGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTGAG CTTGAAACTA AOCAAAAGGA GGTGACAGCT TCTTGTGACC 1800 


TGAGCTGCAT OCTAAAGCGA ACOGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860 
AGGCCGTCCA CAGGGAGCAG TTTCACCTOC AGCTCTCAGG CATGAAGCTC GACGTGGCTA 1920 
AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980 
CAOAAAACCA ATGTGTCAGT TGCAGGGCTG GGACCTATTA TOATGGAGCA COAGAACGCT 2040 
GCATTTTATG TCCAAATGGA ACCTTOCAAA ATOAGGAAGG ACAAATGACTTGTGAAOCAT 2100 
GCCCAAG ACC AGGAAATTCT GGGGCCCTG A AGACOOCAGA AGCTTGGAAT ATGTCTOAAT 2160 
GTGGAGGTCT GTGTCAACCT GGTGAATATT CTGCAG ATGG CTTTGCAOCT TGCCAGCTCT 2220 
GTGCCCTGGG CACGTTCCAG CCTGAAGCTG GTCGAACTTC CTGC1TCCCC TGTGGAGGAG 2280 
GCCTTGCCAC CAAACATCAG GGAGCTACrr CCnTCAGGA CTGTGAAACC AGAGTTCAAT 2340 
GTTCACCTGG ACATTTCTAC AACACCAOCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400 
CATAOCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG COCAGGAAAT ACTACGACTG 2460 
ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 
GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCC AGG CAATTAOCCA GCCAACAOCG 2580 
AGTCTACGTG GAOCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATOGTG GTOCCTGAG A 2640 
TCTTCCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT G ATGCGG AAA AO CTCTTC AT 2700 
CXAATTCTCT GACAACATAT GAAACCTGCC AG ACCTACGA ACGCCCCATC GCCTTCACXT 2760 
OCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATG A AGGG AACAGC GCTAGAGGGT 2820 
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880 
GAGATGGCAG GCTCTATGCA TCTGAG AACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940 
TCAAGGCTCT GTTTGATGTC CIGGCCCATC COCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000 
AGTCCOGAGA GATGTTTOCA AGATCGTTCA TCOGATTGCT ACGTTCCAAA GTGTOCAGGT 3060 
TTTTGAGACC TTACAAAIQA CTCAGOCCAC GTGOCACTCA ATACAAATGT TCTGCTATAG 3120 
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCIGCCTC 3180 
OCGTATCAGT GACTCATTAG AGTTCAATTT TTATAGATAA TACAGATATT TTGGTAAATT 3240 
GAACTTGGTT TTTCnTOCC AGCATOGTGG ATGTAGACTG AG AATGGCTT TGAGTGGCAT 3300 
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360 
TTGGTCAGCC TAGGTGAGAC TCACCTGTOC TTCTGGGGTC TTACTOCTOC 1CAAGGAGTC 3420 
TGTAGTGOAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480 
CGGGCOCTCT CTAAGGGAGC OCTCTGCACT CGTGTGCAGG CTCTG AOCAG GCAGAACAGG 3540 
CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG AGCTGGGAGG 3600 
ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTTGATOC CAGGAACTTG 3660 
AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720 
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BC02 Protein sequence 

Protein Accession #: CAB82285 

1 11 21 31 41 51 
I I I I I I 

MGVAGRNRPG AAW AVLLLLL LLPPLLIXAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60 
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120 
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTdHR SEEGLSCMNK 180 
DHGCSHICKB APRGSVACEC RPGFELAKNQ RDOLTCNHG NGGCQHSCDD TADGPBCSCH 240 
PQYKMHTDGR SCLEREDTVL EVTESNTTSY VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300 
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360 
SCQDVDECSL DRTCDHSON HPGTFACACM RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGIXP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL 480 
SSDVTORTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540 
PGAPGRPSTP KEMFTTVEFE LETNQKEVTA SCDLSCJVKR TEKRLRKAIR TLRKAVHREQ 600 
FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA REROLCPNG 660 
TFQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYS ADGFAP CQLCALGTPQ 720 
PEAGRTSCFP CGGGLATKHQ GATSPQDCET RVQCSPGHFY NTTTHRQRC PVGTYQPEFG 780 
KNNCVSCPGN TTTDFDGSTN ITQCKNRROG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840 
FFPKRRILIV VFEEFLP1ED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900 
IQFKSNEGNS ARGFQ VPYVT YDEDYQEUE DIVRDGRLYA 5ENHQHLKD KKUKALFDV 960 
LAHPQNYFKY TAQESREMFP RSFIRLLRSK VSRFLRPYK 

SEQ ID NO:190 BFQ1 DNA sequence 

Nucleic Acid Accession #: AF007170 

Coding sequence: 1-1725 (underlined sequences correspond to stop codon) 

1 11 21 31 41 51 
I I I I I I 

AAGGAGGOGG CCTCCGGGAA AAGCG AOCGC AGG ACTOCTG AGAGCAGCCT CCATG AGGCC 60 
CTGGACCAGT GCATGACCGC GCTGGACCTC TTOCTCACCA AOCAGTTCTC AGAAGCACTC 120 
AGCTACCTCA AGCCCAGAAC CAAGGAAAGC ATGTAOCACT CACTGACATA TGOCACCATC 180 
CTGGAGATGC AGGCCATGAT GACCTTTGAC CXTTCAGGACA TCCTGCTTGC CGGCAACATG 240 
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300 
TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGGCAATTCA CTGAAGAAGA AATOCAOGCT 360 
GAGGTCTGCT ATGCAGAGTG CCTGCTGCAG CG AGGAGOCC TGACCTTCCT GCAGGACGAG 420 
AACATGGTG A GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GAOCTACAAG 480 
GAGCTGGACA GCCTTGTTCA GTCXJTCACAA TACTGCAAGG GTGAGAACCA CCCGCACTTT 540 
GAAGGAGGAG TGAAGCTTGG TGTAGGGGOC TTCAACCTC A CACTGTOCAT GCTTOCTACT 600 
AGGATOCTG A GGCPGTTGGA GnTGTGGGG TTTTCAGGAA ACAAGGACTA TGGGCIX5CTG 660 
CAGCTGG AGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTGTGT CATGCTCCTG 720 
CIGTGCTAOC ACACCTTGCT CACCTTOGTG CTDGGTACTG GGAAOGTCAA CATCG AGG AG 780 
GCCGAGAAGC TCTTGAAGGC CTACXTGAAC OGGTACOCTA AGGGTGCCAT CTTOCTGTTC 840 
TTTGCAGGGA GGATTG AAGT CATTAAAGGC AACATTO ATG CAGOCATCGG GCX3TTTCG AG 900 
GAGTGCTGTG AGGCCCAGCA GCACTGGAAG CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATGTGGTGCT TCACCTACAA GGCOCACTGG AAGATGTCCT ACTTCTACGC CGACCTGCTC 1020 
AGCAAGG AGA ACTCCTGGTC CAAGGCCACC TACATTTACA TGAAGGCCGC CTACCTCAGC 1080 
ATGTTTGGGA AGGAGGACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCGAGCT 1140
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAG AGAA GTTTGCCATC 1200 
CGGAAGTCCC GGGGCTACTT CTCCTCCAAC CCTATCTCGC TGCCAGTCOC TGCTCTGGAA 1260 
ATGATGTACA TCTGGAACGG CTAOGCCGTG ATTCGGAAGC AGCOGAAACT CAOGGATGGG 1320 
ATACTTGAGA TTATCACTAA GGCTGAAGAG ATGCTGGAGA AAGGOOCAQA GAAOGAGTAC 1380 
TCAGTGGATG ACQ AGTGCTT GGTGAAATTG TTGAAAGGCC TGTGTCTG AA ATACCTGGGC 1440 
CGTGTCCAGG AGGCCGAGGA GAATTTTAGG AGCATCTCTG CCAATGAAAA GAAGATTAAA 1500 
TATGAOCACT ACTTGATCCC AAAOGCCCTG CFGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560 
GACAGAAACG AAG AGGCCAT CAAACTTTTG GAATCTGOCA AGCAAAACTA CAAGAATTAC 1620 
TGCATGGAGT CAAGG ACACA CTTTCGAATC CAGGCAGOCA CACTCCAAGC CAAGTCTTCC 1680 
CTAGAGAACA GCAGCAGATC CATGGTCTCA TCAGTGTCCT TGX^QCTTTG TGCAGCAGTT 1740 
CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTOCTGAA AACATTTCAA AATAOOOOCT 1800 
COOOCTGOOC TGOCCIXjOCT TTGGGGTOCA CCGGCACTOC AGTTGGATGG CACAACATAG 1860 
TGTATCCGTG CAGAAGCCGA GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGOCAAG 1920 
GGCAGAGCAG GTGOAGCOCT CTGCCTGOCC TATCACACAT ACGGGTACTT GCTTTTCACT 1980 
GTGATGTTTA AG AG AATGTA TGAACAGTTF ACATTTTCCT TAGAAATACA TTGATGGGAT 2040 
CACAGTTGGC TTTAAAAAOC AACAACAATC AAOCACCTGT AAGTCTTTGT CTTCAOCTAT 2100 
TATCA1CFGG AGGTAAATCT CTTTATATGA TGATGOCAAA GGGGAAATTG CTTTTCAAAT 2160 
TCAGCAAGTT CTCAGCTTGT GTG ACGGAAG GTCCTTCAGA GGACCTGAGG AATGCCTGGG 2220 
AGAGGCTAAG CCTCAGGCTT CAATGCXTCT GGGGTTGGGC ATGAGGATGT ACACAG ACAC 2280 
CCACTACCTT ACTACTCACA CTTCATTTCA CTCCTTTTGT AAATTTOCAA TTTAAAAATC 2340 
AAGCACGTCT TTTTAGTGAG ATAAAATCTG AGCTCTTCTG TAGAAAAATC AATCTCTAOC 2400 
AGTAGAAAAT GOCAGGGCTT GATGGAAG AG CTGTGTAGCC CTTTCTATGC CAAAGCCAGG 2460 
AAATTTGGGG GGCAGGAGGA GGTTCTCAGA ATCCAGTCTG TATCTTTGCT GTATGCCAAA 2520 
CTGAAACCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTTCA CTCCAAGGTA 2580 
CATTTGCTTA CTGACAGCAT TTTTGTTAAA ACTGTTATTC TTG AAAAAAA AAAAAAAAAA 2640 
AA 

SEQ ID NO:191 BFQ1 Protein sequence

Protein Accession #: AAC39582

1 11 21 31 41 51 
I I I I I I 

MTALDLFLTN QFSEALSYLK PRTKESMYHS LTYATILEMQ AMMTFDPQDl LLAGNMMKEA 60 
QMLCQRHRRK SSVTDSFSSL VNRPTLGQFT FF.EIHAEVCY AECLLQRAAL TFLQDENMVS 120 
FIKGGIKVRN SYQTYKELDS LVQSSQYCKG ENHPHFEGGV KLGVGAFNLT LSMLPTRILR 180 
LLEFVGFSGN KDYGLLQLEE GASGHSFRS V LCVMLLLCYH TFLTFVLGTG NVNIEEAEKL 240 
LKPYLNR YPK G AIFLFFAGR IEVIKGNIDA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300 
TYKGQWKMSY FYADLLSKEN CWSKATYIYM KAAYLSMPGK EDHKPFGDDE VELFRAVPGL 360 
KUQAGKSLP TEKFAIRKSR RYFSSNPISL PVPALEMMYI WNGYAVIGKQ PKLTDGILEI 420 
ITKAEEMLEK GPENEYSVDD ECLVKLLKGL CLKYIjGRVQE AEENFRSISA NEKKDCYDHY 480 
UFNALLELA LLLMEQDRNE EAIKLLESAK QNYKNYSMES RTHFRIQAAT LQAKSSUBNS 540 
SRSMVSSVSL 

SEQ ID NO:192 BF06 DNA sequence

Nucleic Acid Accession #: NM_032583

Coding sequence: 1^044 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGACTAGGA AG AGGACATA CTGGGTGCCC AACTCTTCTG GTGGCCTCGT GAATCGTGGC 60 
ATCXjACATAG GCGATGACAT GGTTTCAGGA CTTATTTATA AAAOCTATAC TCTCCAAGAT 120 
GGCCCCTGGA GTCAGCAAGA GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTCCCACCG 180 
TGGGGGAAGT ATGATGCTOC CTTGAGAACC ATGATTCGCT TCCGTOCCAA OCOGAGGTTT 240 
CCTGCCCCCC AGCCCCTGGA CAATGCTGGC CTGTTCTCCT ACCTCACCGT GTCATGGCTC 300 
AOGCOGCTCA TGATGCAAAG CTTAOGG AGT OGCTTAGATG AGAACACCAT CCCTCCACTG 360 
TCAGTOCATG ATGCCTCAG A CAAAAATGTC CAAAGGCTTC AOCGOCTTTG GGAAGAAG AA 420 
GTCTCAAGGC GAGGGATTGA AAAAGCTTCA GTGCTTCTGG TGATGCTGAG GTTCCAGAGA 480 
ACAAGGTTGA TTTTCGATGC ACTTCTGGGC ATCTGCTTCT GCATTGCCAQ TGTACTCGGG 540 
OCAATATTGA TTATACCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600 
CATGG AGTGG GACTCTGCTT TGCOCTTTTT CTOCCGAAT GTGTGAAGTC TCTG AGTTTC 660 
TOC70 CAQTT GGATCATCAA OCAAOGCACA GCCATCAGGT TCCGAGCAGC TGTITCCTCC 720 
TTTGOC11 lli AG AAGCICAT CCAATTTAAG TCTGTAATAC ACATCACCTC AGGAGAGGCC 780 
ATCAGCTTCT TCAOCGGTGA TGTAAACTAC CTCTTTGAAG GGGTGTGCTA TGGACOCCTA 840 
GTACTG ATCA OCTGCGCATC GCTGGTCATC TGCAGCATTT CTTOCTACTT CATTATTGG A 900 
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGTATTCATG 960 
ACAAG AATGG CTGTG AAGGC TCAGCATCAC ACATCTG AGG TCAGOGAOCA GCGCATCCGT 1020 
GTG ACCAGTG AAGTTCTCAC TTGCATT AAG CTGATTAAAA TGTACACATG GGAGAAAOCA 1080 
TTTGCAAAAA TCATTGAAGG TATGGAAAGT CTGACnTCT GCTCCAAAOC TGGTGATGGC 1140 
ATGGCCTTCA GCATGCTGGC CTCCTTGAAT CTCCTTCGGC TGTCAGTGTT CTTTGTGCCT 1200 
ATTGCAGTCA AAGGTCTCAC GAATTCCAAG TCTGCAGTGA TGAGGTTCAA G AAGTITTTC 1260 
CTCCAGGAGA GCCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTGTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAGCTGG AGA GGAACGGGCA TGCTTCTGAG GGGATGACCA GGOCTAGAGA TCCCCTCGGG 1440 
CCAGAGGAAG AAGGGAACAG CCTGGGCOCA OAGTTGCACA AGATCAACCT GGTGGTGTCC 1500
AAGGGGATG A TGTTAGGGGT CTGCGGCAAC ACGGGGAGTG GTAAGAGCAG CCTGTTGTCA 1560 
GCCATCCTGG AGGAG ATGCA CTTGCTCG AG GGCTCGGTGG GGGTGCAGGG AAGCCTGGOC 1620 
TATOTCOOOC AGCAGGCCTG GATOGTCAGC GGGAACATCA GGGAGAACAT OCTCATGGG A 1680 
GGCGCATATG ACAAGGCCCO ATACCTCCAG GTGCTCCACT GCTGCTCCCT GAATCGGGAC 1740 
CTGGAACTTC TGOOCTTTGG AGACATGACA OAGATTGGAG AGOGGGGOCTCAAOCTCTCT 1800 
GGGGGGCAGA AACAGAGGAT CAGOCTGGOC CGCGCOGTCT ATTCOGACOG TCAGATCTAC 1860 
CTGCTGGACG ACCCCCTGTC TGCTGTGGAC GCCCAOGTGG GGAAGCACAT TTTTGAGGAG 1920 
TT3CATTAAGA AGACACTCAG GGGGAAGACG GTCGTCCTGG TGACOCAOCA GCTGCAGTAC 1980 
TTAGAATTTT GTGGCCAGAT CATTTTGTTG GAAAATGGGA AAATCTGTGA AAATGGAACT 2040 
CACAGTGAGT TAATGCAGAA AAAGGGGAAA TATGCCCAAC TTATGCAGAA GATGCACAAG 2100 
GAAGCCACTT OGGACATGTT GCAGGACACA GCAAAG ATAG CAGAG AAGCC AAAGGTAGAA 2160 
AGTCAGGCrc TGGCCACCTC CCTGGAAG AG TCTCTCAAGG GAAATGCTGT GCCGGAGCAT 2220 
CAGCTCACAC AGGAGGAGGA GATGGAAGAA GGCTCCTTGA GTTGG AGGGT CTACCAOCAC 2280 
TACATCCAGG CAGCTGGAGG TTACATGGTC TCTTGCATAA TTTTCTTCTT CGTGGTGCTG 2340 
ATOGTCTTCT TAACGATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGGA GCAGGGCTCG 2400 
GGGACCAATA GCAGCOGAG A G AGCAATGGA ACCATGGCAG AGCTCGGCAA CATTGCAGAC 2460 
AATCCICAAC TOTCCTTCTA CCAGCTGGTG TACGGGCTCA ACGOOCTGCT OCTCATCTGT 2520 
GTCGGGGTCTGCTCCTCAGG G A1 1 1 1C ACC AAAGTCACGA GGAAGGCATC CACGGCCCTO 2580 
CACAACAAGC TCTTCAACAA GGTnTOCGC TGCCCCATGA Gl 1 1CH 10 A CACCATCOCA 2640 
ATAGGCCGGC TTTTGAACTG CTTCGGAGGG GACTTGGAAC AGCTGGACCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTOTCC TTAATGGTGA TOGCCGTCCT GTTGATTGTC 2760 
AGTGTGCTOT CTCCATATAT CCTGTTAATG GGAGCCATAA TCATGGTTAT TTGCTTCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880 
ICIOCT 1 1 AT TCTCOCACAT CCTCAATTCT CTGCAAGGOC TGAGCTCCATOCATGTCTAT 2940 
GGAAAAACTG AAGACTTCAT CAGGCAGTTT AAG AGGCTGA CTG ATGCGCA GAATAACTAC 3000 
CTGCTGTTGT TTCTATCTTC CACACGATGG ATGGCATTGA GGCTGGAGAT CATGACCAAC 3060 
CTTOTGAOCT TGGCTGTTGC CCTGTTCGTG GCTTTTGGCA TTTDCTOCAC CCCCTACTCC 3120 
TTTAAAGTCA TGGCTGTCAA CATCGTGCTG CAGCTGGGGT CCAGCTTOCA GGOCACTGOC 3180 
OGGATTGGCT TGGAGACAGA GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240 
AAGATGTGTG TCTOGGAAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC CCAGGGGTGG 3300 
GCACAGCATG GGGAAATCAT ATTTCAGG AT TATCACATGA AATACAG AGA CAACACACCC 3360 
ACCGTGCTTC AOGGCATCAA CCTG ACCATC CGCGGCCACG AAGTGGTGGG CATCGTGGG A 3420 
AGG ACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCTCTCT TCCGCCTGGT GGAGCCCATG 3480 
GCAGGCCGGA TTCTCATTGA GGGOGTGGAC ATTTGCAGCA TOGGCCTGGA GGACTTGCGG 3540 
TOCAAGCTCT CAGTGATCCC TCAAGATCCA GTGCTGCTCT CAGG AAOCAT CAGATTCAAC 3600 
CTAGATCOCT TTGACOGTCA CACTGAOCAG CAG ATCTGGG ATGCCTTGGA GAGGACATTC 3660 
CTGAOCAAGG CCATCTCAAA GTTGGCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG GCAGGGCTGT GCTTCGCAAC 3780 
TCCAAGATCA TCCTTATCG A TG AAGCCACA GOCTCCATTG ACATGGAGAC AGACACCCTG 3840 
ATCCAGCGCA CAATCCGTG A AGCCTTCCAG GGCTGGAOOG TGCTOGTCAT TGCCCAGCGT 3900 
GTCACCACTG TGCTGAACTG TGACCACATC CTGGTTATGG GCAATGGGAA GGTGGTAGAA 3960 
TTTGATCGGC CGGAGGTACT GCGGAAGAAG CCTGGGTCAT TGTTCGCAGC CCTCATGGCC 4020 
ACAGCCACTT CTTCACTGAG AJAAGGAGAT GTGG AGACTT CATGG AGGCT GGCAGCTGAG 4080 
CTCAGAGGTT CACACAGGTG CAGCTTCGAG GOCCACAGTC TGCG ACCTTC TTGTTTGGAG 4140 
ATGAGAACTT CTCCTGGAAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200 
ACCCTGGAAT AGGCTACTTG ATGGCTCTCA AGACCTTAGA ACOCCAGAAC CATCTAAGAC 4260 
ATGGGATTCA GTGATCATGT GGmCTCCTT TTAACTTACA TGCTGAATAA TTTTATAATA 4320 
AGGTAAAAGC TTATAGTTTT CTGATCTGTG TTAGAAGTGY TGCAAATGCT GTACTG ACTT 4380 
TGT AAAAT AT AAAACT AAGG AAAACTCAAA AAAAAAAAAA AAAAAAA 

SEQ ID NO:193 BF06 Protein sequence

Protein Accession #: NP_115972.1

1 11 21 31 41 51 
I I I I I I 

MTRKRTYWVP NSSGGLVNRG IDIGDDM VSG UYKTYTLQD GPWSQQERNP EAPGRAAVPP 60 
WGKYDAALRT MIPFRPKPRF PAPQPLDNAG LFSYLTVS WL TPLMIQSLRS RLDENTIPPL 120 
SVHDASDKNV QRLHRLWEEE VSRRGIEKAS VLLVMLRFQR TRLIFDALLG ICFCIAS VLG 180 
PUJIPKILE YSEEQLGNW HGVGLCFAUF LSECVKSLSF SSS WHNQRT A1RFRAA VSS 240
FAFEKLIQFK SVHHTSGEA BFFTGDVNY LFEGVCYGPL VUTCASLVI CSISSYFDG 300 
YTAHAHjCY LLVFPLAVFM TRMAVKAQHH TSBVSDQRIR VTSEVLTOK LIKMYTWEKP 360 
FAKIIEGMES LTPCSKPGDG MAFSMLASLN LLRLSVFFVP IAVKGLTNSK SAVMRFKKFF 420 
LQESPVFYVQ TLQDPSKALV FEEATLSWQQ TCPGIVNGAL ELERNGHASE GMTRPRDALO 480 
PEEEGNSLGP ELHKINLWS KGMMLG VCGN TGSGKSSULS AILEEMHLLE GSVGVQGSLA 540 
YVPQQAWIVS GN1RENILMG G AYDKARYLQ VLHCCSLKRD LELUPPGDMT EIGERGLNLS 600 
GGQKQR1SLA RAVYSDRQIY LLDDPLSAVD AHVGKHIFEE CIKKTLRGKT WLVTHQLQY 660 
LETCGQHLL ENGKICENGT HSELMQKKGK YAQUQKMHK EATSDMLQDT AKIAEKPKVE 720 
SQALATSLEE SLNGNAVPEH QLTQEEEMEE GSLSWRVYHH YIQAAGGYMV SCUFFFWL 780 
IVFLTIFSFW WLSYWLEQGS GTNSSRESNG TMADLGNIAD NPQLSFYQLV YGLNALLUC 840 
VGVCSSGDFT KVTRKASTAL HNKLFNKVFR CPMSFFDTtP K3RLLNCFAG DLEQLDQLLP 900 
IFSEQFLVLS LMVIAVLUV SVLSPY1LLM GAUMVICH YYMMFKKAIG VFKRLENYSR 960 
SFLPSHXLKS LQGLSSIHVY GKTEDFISQF KRLTDAQNMY LLLFLSSTRW MALRLEIMTN 1020 
LVTLA VALFV AFGISSTPYS FKVMAVNIVL QLASSPQATA RIGLETEAQF TAVERILQYM 1080 
KMCVSEAPLH MEGTSCPQGW PQHGEUFQD YHMKYRDNTP TVLHGINLTI RGHEYVGJVG 1140 
RTGSGKSSLG MALFRLVEPM AGRIUDGVD ICSK3LEDLR SKLSVDPQDP VLLSGTIRFN 1200 
LDPFDRHTDQ QIWDALERTF LTKA1SKFFK KLHTD WENG GNFSVGERQL LOARAVLRN 1260 
SKlHiDEAT ASIDMETDTL IQRTIREAFQ GCTVLVIAHR VTTVLNCDHI LVMGNGKWE 1320 
FDRPEVLRKK PGSLFAALMA TATSSLR 
SEQ ID NO:194 BHP9 DNA sequence

Nucleic Acid Accession #: AA983251

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

ATGCTGTCTG GCTTCTTGAT GAGTCCCAGT ACCCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TT0CGT6GGA G6CTTCCATC GGTGCGCACA CCTCCCGAGG GCGAGGCAGC 120 

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGG6ACCG CGCTGCA6CC 180 

GG66A6GC66 AGAAGGGGAA CCGGGGC6AG CCGCCCGCCT GGATCCGCGC CCAGCAGCAQ 240 

CCGCGGCCGC CGCCAGCTGG GCACGCTOCC GGGACTGCGG CTGGGGGCGC GCAGGACCC? 300 

CGCCTGCGTC CT6QACGTTC CCGGGGGAGG GTCC6GTTGC CA6TGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCC6666 OCCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420 

ACTCATAAG6 CAGTCCCTAA GGGGACCGGG OCACCGGCTG AGGACGG6GA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCC6 GCGTCGTCQC CT0CTGGGC6 TCGCGGCAGA GGGGAGT6GC 540 

CCGCGCGGAA A6CGCC6CGG GACAGTCAGT GACGAGGCCC GGGOGTCGCC GGGGCCACGA 600 

CTTCTCGGAG A006TCCTGC GCTCTCT66A GACGCGCTOT CCGCGCCCAO GGTGGTGCCA 660 

TCTGGGGCGC TCGCCGCTCG TCCGTCTCCT CATOCTGGAA GGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTG GCGGCGGGGG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780 

TGGCTGGACG CGCAGGGCGT CT66CGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCT6CTC CAGCGCOGAQ 900 

GCGCGCCTGG ACCAGGGCGG CTGC6ACAAT GACC6CCAGC A6GGC6CTGG C6A0CCTGGC 960 

C6G606GACA AAGACGGGCC CC6ACG6CTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AG6GTQCGCC CCCACCCGTQ AGGGCCTGGC AGOGGTGCTC CCCTGAAGGC 1080 

TOOCCGAAAQ GAAGGCAGCT CCTCAGGGCT TTCCCGGGGC TGCTGCCCCO TGCCAGACGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTCCCC TGCA6CGGCC C6CCTT0CCC 1200 

ATCTACGTSC CGTTCCTCAT TGTTGGCTCC GW3TTTGTCG CCTTTATCAT CTTGGGGTCC 1260 

CTGGTGGCAQ CCTGTT6CTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATCCCCATGA TCCCCAGTGC CAGCACCTCC 1380 

CGGGGGTCGT CCTCACGCCA GTCCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

GGGGCCCGGG CGCCCOCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA AGGGACCATG 1500 

AACAACGTGT ATGTCAACAT GCCCACGAAT TTCTCTGTGC TGAACTGTCA GCAGGCCACC 1560 

CAGATTGTGC CACATCAAGG GCAGTATCTG CATCCCCCAT ACGTGGGGTA CACGGTGCAG 1620 

CACGACTCTG TGCCCATGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA GCCTGGCTAC 1680 

AGGCAGATTC AGTCCCOCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC CGAGAGTCAC TGGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTG 1800 

GATTCTCGAG GTGGAAGTCC GCACATGTCG GTGGTATTTA TGGCACGATT CCTTTGGATQ 1860 

GCTTCATTTG COCCCAGACT GTATGAAAAC ATCTCCGAAT TAGCATTTCT GGATATGTTT 1920 

CATCGAGGGT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTGGAGAT GACTGTGATQ 1980 

TTGCTGATGG GTGTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATO TTGTTATTAT TATTTCTTTT 2100 

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

TTTTTTTTTT TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGOGATC TCGGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGCCTC 2280 

AGCCTCCCAC GTGGCK3GGA TEACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCACC ATG1TGGCTG GGCTGGTCTC ACTCTCCTGA CCTCAAGCAA 2400 

TCTGCCTGTC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TlVm - ICl ' A ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520 

ATTCTAAAAG GAAACCTGTT TGAACTCTGT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACACCTT AATTTCATTG TAAAAAGATA TATATATTTT GTCTATTTTT GTGCTTTTCG 2640 

GGGCCTATTT TQT QCTT TTT TACCTTATOT AGAGATCTTA TOACAAAGTG ATTTTCTACA 2700 

TTAAAAAGAG ACTGAAATAA ATTGTATAGT TACTTAACTA ATGAAGACAT TTCAGAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTGA TAAAACCATT CATCCCCTTC 2820 

TTGATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAO T TA CTTT T AA TTTTGCCTTG 2940 

TGGTTATGGT TTGGCGTTTC CTTCTGTTTG GTTTTCAGAG CCCCATGTCT ATATAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GOC ATAACCT GAGACTTGGG ATGAAATTTA AACCAQATAC GATTTACTTT 3180 

GCAGATCATA AGGCTTTTTA TACTCTTGTT ATCAAAATGG CTTATTTTTC AGGCACTAAO 3240 

GATTGTTAAG AGAAAAGCTT TTCAACGAAG GATTGCCTTT CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420 

ATGTGCAAGT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAACCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA ACCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 
SEQ ID NO:195 BHP9 Protein sequence
Protein Accession #: i



l 
I 

MLSGFLMSPS 



IX 
I 

TQHRAQYTPQ 
PPAWIRAQQQ 
CIPRFPSASA 



21 
I 

GKKLPWEASI 
PRPPPAGQAP 
THKAVPKGTG 
LL6DRPALS0 
CCWLRCWRRG RGPSGEYCHG WLDAQGVWRI 
ARLDQGGCDN DRQQGAGEP6 RADKDGPRRL 
SPKGRQLLRA FPGLLPRARR RGFPSSPRGQ 
LVAACCCRCL 
GARAFPTRSQ 
HDSVFMTAVP FFMDGLQPGY RQIQSPFPHT 



41 



OTARGGAQDP 
PPAEDGDQLO 
DALSAPKWP 
GEFQCPERPDQ 
GRASCLRGTQ 
PSPLQRPALP 
IPMIPSASTS 
PSVLNCQQAT 
NSEQKMYPAV 



RLRPGRSRGR 
APGFRARRRR 
CGALAARPSP 
GDATICCGSC 
GDGEGAPPPV 
IYVPFLIVGS 



QXVPBQGQYL 
TV 



51 
I 

AGLLWDRAAA 
VRLPVKPPEA 
LLGVAAEGSO 
HPGTPLRSCS 
ALKYCCSSAB 
RAKQHCSPEG 
VFVAFIILGS 
AASSSSSANS 
HPPYVGYTVQ 



60 
120 
180 
240 
300 
360 
420 
480 
540



SEQ ID NO:196 CQA5 DNA SEQUENCE

Nucleic Acid Accession #:

Coding sequence: 862-1995 (underlined sequences correspond to start and stop codons)



GCCCTTGGAC 
CTGAAGAAAA 
GCGCGGGGCC 
CTGGGCCAGA 
CGGCTACTGC 
TGTGCCAGCC 
ACCTCACCCC 
CTCACCCAGG 
GCGCTCATTA 
GATTCCACCT 
AGCCCTTCGA 
GCCCAGGCAC 
GCCTGCCCCC 
ACATGGGCTO 



GGTCCCATCT 
AGAGG6CGCG 
CAGGACGAGG 
GTAAGCGGGG 
CTGGCCAAGG 
GGCCTGCATG 
TTGCCCCACG 
GACAGCTCCC 
CTGGGGTCCT 
GAGAGGCCAC 
GGCAGGTCCC 
GGAGTACGCA 
GAACCAGGGG 
TCAGTGTGTG 
CCCGATGCGG 
ACACTGTCCC 
CCTTCCGGAG 
TGCTGCACCT 
GCCCTCCTAC 
ACCTCCTGGG 
AGGTGGACTG 
GGCTGGGGTC 
TGGGGGATCC 
GGTGACTTCA 
GAGACAGGCT 
AAAGAAATAG 
CACGAGGGGA 
GCAGAOCCTG 
GAGCAGCGTC 
GCGTGCACAC 
CAGAAGTGTC 

CTGGAATCCC 
CCOCATCTCT 
TCATAAACAC 
TAGACCCAGA 
AGAAATAAAA 



11 
I 

ACTGACATGG 
AGGAGCTGGA 
GCGACTGGTA 
GCAGAGCCAG 
CCAAGGTACA 
GGGCCCTGCC 
CGGTCTGGCA 
AGGTGACCGA 
AGCAGCTGTT 
TCATCTAGTC 
G GGTGG GCGC 
AGTCCCGGAG 
GGCTGGTCCC 
GGGGCTCTCT 
GGTACCCCTC 
TCAGGGAAAG 
GGGCGGCTCC 
TGGCTGTAGC 
GGTGCCTGCC 
CTGAGGGACC 
TGCCTCCCAC 
TTGAGTCCCA 
AGGCACGTCA 
GCTCACCCCC 
CTCCCTCAGC 
CTTGGGTGTC 

CACGGCAACA 
TGGGGCGCAG 
GGTCAGTGCG 
ACAAGGCACC 
CCCAGCTCCA 
GGTCTGCAGG 
CCTGAAGATG 
CAGGAAAGGG 
CAGCGCAGTG 
TGCCCACCAG 
TGGCATCTTT 
TCAGGAGACC 
GGCACCTCCG 
GTCCTCCCAG 
GAATTTAAAG 
CCTGGAGCCT 
CCTGGGCTCT 
TGTGATGACA 
CCCAGTTGAG 
ATCAAGTTCC 
AGCACTTGAG 
ACAARAAAAA 
CACAAGGAAA 
TACTAGAATT 
GAGATTTCTG 



21 
I 

ACTGAAGGAG 
GCAGGAGAAG 
CCAGCAGCAG 
GGCGGACTTT 
AGAGGTGGCC 
CCCGTCCTCC 
GCAGCAGACC 
GAAGAGTGAG 



31 41 51 

I I I

TAGAATGGAG CACGAGGACA CTGACATGGA 
GAGGTGCTGC TGCAGGGTTT GGAGATGATG 
CTGCAACGAG TGCAGGAGCG CCAGCGCCGC 
GGGGCTGCAG GGAGCCCCCG CCCACTGGGG 



CCCATCGCAC 
TGGGCGCCTT 
CGCACCGAGC 
TGAGTCCGCA 
CATGAGTTAG 



GACGCGGGTC 
TCGGACGGAC 
TGGCTGGGGA 
CTGGCTGCAG 
AGACCCTGGG 
CACAACATCC 
TAGGCAAAGC 
CTTTGCTCTC 
CAAGGAAAAC 
ACTOCCTCAG 

GCATCGATGG 
GGCCTCCGAT 
TGGGGGGCGC 
TGTCTCAGAG 
TGCTAACCTG 
GGTGTCCCAG 
GGAGTGGGCT 
TGCAGGTCCT 
GGTGGGCCAG 
GGCCTCCCCA 
ACTGGACTGG 
GCCCACATRG 
GAAAAACTGC 
TTTACAGCTT 
GCCCOGGCTG 
GCCCTAGGAC 
ATCCGCGAGG 
CCCGGAAATG 
AATCTGCOCC 
AAGGAAAAGG 



AAAAAGAAAG 
CAATACACTA 
ATCAGAGAGA 
GAAACATGAA 



TCCGGGCCCC 
ATCCTCATGC 
CGCATCACGC 
GCCCTGAGCC 
GCGTGGGCCC 
CCACCCTCTC 
CCTGCCGCCC 
GCTTGACTCC 
TAGTCCGCAG 
CGTCCCCCOG 
CGCCAGGCTG 
CAAGGGCAGC 
GGAAGTAGAT 
GOCCCAGGGA 
CGGATCGGCA 
GTGATGGCCT 
TGTGAGCCTG 
CTGTTTCCCC 
ACGCCCAGCC 
GAGAACCCCC 



AGCCCAACCT 
GTTCTGCAGC 
GCGGGGTCAG 
AGGGCCCCCT 
GAGGGGCCCT 
CCCACAGCAA 
GACAGGCCCA 
TTCCAGGGGA 
GAGGGCCTGT 
TGGCAGCCAG 

cqw c m c c i t 
aagcaggaga 

AGCTGGACCC 
CTTTCAGCCT 
GAAATCAGGC 
GCAGGGTCTA 
GCTGGGCGGG 
TGCCAGTAGC 
TCTCAGGATG 
AGAGGAACAC 
AACATCTCAG 
CCAGAGCAGC 
AAAGAAAATG 
TGAGACCCAG 
ATATAAAGTA 



CCTGOCCTGC 
TGAAGGAGCA 
AGCTGGAGCA 
AGCAGGACGG 
CCAGGGCCAG 
TGGCTGGAGA 
TTGCCAGATG 
GTTTKGGCTC 
CTACTACTGG 
TTTCCAGCGG 
CACTTCCAAC 
TTCCCGCTCA 
GGAGGGGGTG 
TAGCGGTCGG 
CGGCGGGTGG 
TCCOCCTCTT 
GCTCCCCAGG 
CGACTCAGGA 
TGTOCCCAGG 
AGGGTACAGG 
GGCCCACTCC 
GGAGGGTCCC 
CCAGGGCCOC 
TGCGTGGGGG 
CGTGTCCAGG 
GGCAGGCAGC 
CCCCACAGAG 
AGTCAGCCCA 
CATAAGGATG 
GCCCCACAGC 
GGAGAAGCCC 
TGAGGGTGCC 
CAGAACAGTG 
CGCAGCTGAA 
TGGTGTTCCG 
TAGTGAGTGG 
GGTGGCTGGC 
TCAGTCTCCG 
GTGTGCAGGT 
TTGAAATGTG 
ACCCACAOCA 
CCGGGCGTGG 
CTGGGCAACG 
AGAGATCCAG 
CAGAAGCAAC 
ACAGTGTTTP 



CCTGACGTCC 
GAACCGACTC 
GGAGAAGTCG 
GGGACCTCTG 
CCTGGCACTC 
CCCCCGGCAG 
GGCTCCCCAO 
CTGGTTGYTG 
CCGCTGTCAG 
TGCCGCCCTG 
AACGGGCAGC 
ACCAGGGCAC 
GGGAOGGCCT 
ACTTGAGGTT 
GCGAGAGCTT 
GGCCGGGACG 
AGGGCCCCCA 
TTTCCAAGGC 
TTTCAGCTGG 
AGGAGGCTGG 

AGTGTCACCA 
CGATGCGGGG 
GCGCAGGGCC 
GCACTTTGGT 
GTGGCAACTC 
CCACATTCCC 
GCATGCAGCT 
TCAGGCCTGG 
CCCAGCACCC 
CCCGTCAGCA 
TGCCATGCCC 
TCTGTCCCGG 
GCGGAAATGT 
TGCAAGGTGA 
CCCTGGAGAC 
AGAGGCACAT 
TGCAGGATGT 
ACATACACOT 
TCCTTGGGGG 
GGCCTCAGGA 
TGGTTCACGC 
CAGTGAGAGA 
GTTTAAAAAT 
AGATTGACTC 
ATATATCTAA 



60 
120 
180
240 
300 
360 
420 
480
540 
600 
660 
720 
780
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
SEQ ID NO:197 LBQ2 DNA SEQUENCE

Nucleic Acid Accession #: X63629

Coding sequence: 54-2543 (start and stop codons are underlined)

1 11 21 31 41 51

I I I I I I

GCGGAACACC GGCOCGOOGT CQCQGCAGCT GCTTCACOCC TCTCTCTGCA QCCAJgGGGC 60 
TCCCTCOTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTOCTGGCTO CAGTGCGCGG 120 
CCTCCGAGCC GTGCCGGGCG GTCTTCAGGO AGGCTG AAGT OACCTTGG AG GCGGGAGGCG 180 
CGG AGCAGGA GCOOGGOCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAG AGC 240 
CAGCTCTGTT TAGCACTGAT AATGATGACT TCACTGTGCG GAATGGOGAG ACAGTCCAGG 300 
AAAGAAGGTC ACTG AAGGAA AGG AATCCAT TGAAGATCTT CCCATCCAAA CGTATCTTAC 360 
GAAG ACACAA GAGAGATTGG GTGGTTGCTC CAATATCTGT CCCTGAAAAT GGCAAGGGTC 420 
CCTTCCOCCA GAGACTOAAT CAGCTCAAGT CTAATAAAG A TAGAGACACC AAGATTTTCT 480 
ACAGCATCAC GGGGOOGGGG GCAGACAGOC COOCTGAGGG TGTCTTOGCT GTAGAGAAGG 540 
AGACAGGCTG GTTGTTGTTG AATAAGCCAC TGGACCGGGA GGAGATTGCC AAGTATGAGC 600 
TCTTTGGCCA CX3CTGTGTCA GAGAATGGTG CCTCAGTGGA GGAGCCCATG AACATCTCCA 660 
TCATCGTGAC CG ACCAGAAT GACCACAAGC OCAAGTTTAC OCAGGACACC TTCCGAGGGA 720 
GTGTCTTAG A GGGAGTOCTA CCAGGTACTT CTGTG ATGCA GGTG ACAGCC ACAG ATGAGG 780 
ATGATGCCAT CTACAOCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840 
AGGACCCACA CGACCTCATG TTCACAATTC ACCGG AGCAC AGGCACCATC AGCGTCATCT 900 
CCAGTGGOCT GGAOGGGGAA AAAGTOOCTG AGT ACACACT GAOCATCCAG GCCACAGACA 960 
TGGATGGGGA CGGCTOCACC AOCAOGGCAG TGGCAGTAGT GGAGATCCTT GATGCCAATG 1020 
ACAATGCTCC CATGTTTGAC CCCCAGAAGT ACGAGGCOCA TGTGCCTGAG AATGCAGTGG 1080 
GCCATGAGGT GCAGAGGCTO ACGGTCACTG ATCTGGACGC CCCCAACTCA CCAGCGTGGC 1140 
GTGCCACCTA CCTTATCATG GGCGGTGACO ACGGGGACCA TTTTAOCATC ACCACCCACC 1200 
CTGAGAGCAA CCAGGGCATC CTG ACAACCA GGAAGGGTTT GGATTTTGAG GCCAAAAACC 1260 
AGCACACCCT GTACGTTGAA GTG ACCAACG AGGCCCCTTT TGTGCTG AAG CTCCCAACCT 1320 
CCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCACCTGTG TTTGTCCCAC 1380 
CCTCCAAAGT CGTTGAGGTC CAGGAGGGCA TCCOCACTGG GGAGCCTGTG TGTGTCTACA 1440 
CTGCAG AAG A CCCTG ACAAG G AGAATCAAA AG ATCAGCTA CCGCATCCTG AG AG ACOCAG 1500 
CAGGGTGGCT AGCCATGGAC CCAGACAGTG GGCAGGTCAC AGCTGTQGGC ACCCTCGACC 1560 
GTG AGGATGA GCAGTTTGTG AGG AACAACA TCTATGAAGT CATGGTCTTG GCCATGG ACA 1620 
ATGGAAGCCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT GATGTCAACG 1680 
ACCATGGCOC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAACCAAAGC CCTGTGCGCC 1740 
ACGTGCTGAA CATCACGG AC AAGGACCTGT CTCCOCACAC CTCXXCTTTC CAGGCCCAGC 1800 
TCACAGATGA CTCAGACATC TACTGG ACGG CAGAGGTCAA CGAGGAAGGT GACACAGTGG 1860 
ll'l imVCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTCTO 1920 
ACCATGOCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TCTCTQCGAC TGCCATGGCC 1980 
ATGTCGAAAC CTGCCCTGG A CCCTGGAAAG GAGGTTTCAT CCTCCCTGTG CTGGGGGCTG 2040 
TCCTGGCTCT GCTO'l 1CC1C CTGCTGGTGC TG CllTl VIT GGTGAGAAAG AAGCGGAAGA 2100 
TCAAGGAGCC CCICCTACTC CCAGAAGATG ACACCCGTGA CAACGTCTTC TACTATGGCG 2160 
AAGAGGGGGG TGGCGAAGAG GACCAGGACT ATGACATCAC CCAGCTCCAC CGAGGTCTGG 2220 
AGGCCAGGCC GGAGGTGGTT CTOOGCAATG ACGTGGCACC AACCATCATC COGACACCCA 2280 
TGTACCGTCC TAGGCCAGCC AAOCCAG ATG AAATOGGCAA CTTTATAATT GAGAACCTGA 2340 
AGGCGGCTAA CACAGACCCC ACAGCCCCGC CCTACG ACAC CCTCTTGGTG TTCG ACTATG 2400 
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CTCCTCCGCC TCCGACCAAG 2460 
ACCAAGATTA CGATTATCTG AACGAGTGGG GCAGCCGCTT CAAGAAGCTG GCAGACATGT 2520 
ACGGTGGCGG GGAGGAOGAC lAQGCGGCCT GCCTGCAGGG CTGGGGACCA AACGTCAGGC 2580 
CACAGAGCAT CTCCAAGGGG TCTCAGTTCC CCdTCAGCT G AGGACTTCG GAGCTTGTCA 2640 
GGAAGTGGCC GTAGCAACTT GGCGGAGACA GGCTATGAGT CTG ACGTT AG AGTGGTTGCT 2700 
TCCTTAGCCT TTCAGGATGG AGG AATGTGG GCAGTTTG AC TTCAGCACTG AAAACCTCTC 2760 
CACCTGGGCC AGGGTTGCCT CAG AGGCCAA GTTTCCAGAA GCCTCTTACC TGCCGTAAAA 2820 
TGCTCAACCC TGTGTCCTGG GCCTGGGCCT GCTGTGACTG ACCTACAGTG GACITTCTCT 2880 
CTGGAATGG A ACCTTCTTAG GCCTCCTGGT GCAACTTAAT TTTTTTTTTT AATGCTATCT 2940 
TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCCAG A GCTGCTGGGC CCACTGGCCG 3000 
TCCTGCATTT CTGGTTTCCA GACCCCAATG CCTCCCATTC GGATGGATCT CTGCGTTTTT 3060 
ATACTGAGTG TGCCTAGGTT GCCCCTTATT TTTTATTTTC CCTGTTGCGT TGCTATAGAT 3120 
G AAGGGTG AG GACAATCGTG TATATGTACT AGAACTTTTT TATTAAAGAA A 



SEQ ID NO:198 LBQ2 Protein sequence

Protein Accession #: CAA45177



1 11 21 31 41 51 

MGLPRGPLAS lliljQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKYFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENQ 120 
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREHAK 180 
YBLPGHAVSB NO AS VEDPMN BITVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240 
DEDDAIYTVN GWAYSIHSQ EPKDPHDLMFTIHRSTGTIS VBSGLDREK VPEYTLTIQA 300 
TDMDGDGSTT TAVAWEILD ANDNAPMFDP QKYEAHVPEN AVGHBVQRLT VTDLDAPNSP 360 
AWRATYUMG GDDGDHFTTT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 
PTSTATIWH VEDVNEAPVF VPPSKWEVQ EGIPTGEPVC VYTAEDPDKB NQKISYRILR 480 
DPAGWLAMDP DSGQVTAVGT LDREDEQFVRNKIYEVMVLA MDNGSPPTTG TGTLLLTLID 540 
VNDHGPVPEP RQITICNQSP VRHVLNITDK DLSPHTSPRJ AQLTDDSDIY WTABVNEEGD 600 
TWLSLKKFL KQDTYDVHLS LSDHGNKBQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660 
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 

GLEARPEWL RNDVAPTHP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780
DYEGSGSDAA SLSSLTSS AS DQDQDYDYLN BWGSRFKKLA DMYGGGEDD 

SEQ ID NO:199 0BI5 DNA SEQUENCE

Nucleic Acid Accession #: NM_012152

Coding sequence: 43-1104 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAA CAAAGC TTGTGATTGT T TT GTOT g CT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTQATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTCTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTOCTO 300 

ATGTTTAACA CAGGCOCAGT TTCAAAAACT TTGACTGTCA ACCGCTCGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCCTGGAG 420 

AOQCACATOT CAATCATGAG GATGCGGOTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGC66TCCC CACACT6G0C 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTOG CCCCCATTTA CAGCAOGAQT 600 

TACCTTOTTT TCT6GACAGT GTCCAACCTC ATGGCCTTCC TCATCATOGT TGTQOTQTAC 660 

CTGCGGATCT ACGTCEACGT CAAOAGOAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TAT6CT6GAC CCCGGGCCTG GT6GTTCTGC TCCTCGACGG CCTGAACTGC 840 

A6GCAGTGTG GCQTGCAGCA TGTGAAAAGG TGGTTCCTQC TGCTGGCGCT GCTCAACTCC 900 

QTCQTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATG6CAC CATGAA6AA0 960 

ATQATCTGCT GCTTCTCTC A GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

6TCCTCA6CA GGAGTGACAC AGGCAGCCAG TACATAGAOG ATAGTATTAQ CCAAGGTGCA 1080 

GTCTGCAATA AAA6CACTTC CTAAA CTCTG GATGCCTCTC GGCCCACCCA GGTQATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 0BI5 Protein sequence

Protein Accession #: NP_036284

1 11 21 31 41 51

I I I I I I

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VXVLCVGTFP CLPIPPSNSL VTAAVIKNRK 60 
FHPPPYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWPLRQGLLD SSLTASLTNL 120 
LVXAVBRHHS IHRMKVHSNL TKKKVTLLIL LVWAIAIFKG AVPTLGWNCL CNI SACS SLA 180 
PIYSRSYLVF WTVSNLMAFL HSVWYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240 
VKTVLGAFW CWTPGLWLL LDGLNCRQCG VQHVKKWPLL LALLNSWNP IIYSYKDBDM 300 
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYBED SISQO A VCNK STS 

SEQ ID NO:201 PAA6 DNA SEQUENCE

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGA CCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAATTATGTT 60 

CATTCTGAAG CCAACAGGAG AACCAAGAOC AAAACTTTA T TGTCTCTG CT TTCATTTCTT 120 

GATGAAAOCT CTGGACTAAG CACACATCTP CCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GT G CTTC A TC TGGACATCCA CGGGAAGAAG GAAGACATQA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGQTQG TTTT ACAATATTTA AGAACCTGTG GATGAGCCTC 300 

AZACCCAGAG GGA ACAAA CG CTCCCCAAAA AGAGTTACAG AAACCAICCT OAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA GATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAGGCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540 

GCTACGTTGT CCCAGCACTT CACTGGTFAA CCTTTTATGT GCACCATTTG TGGATTTCAC 600 

A GCTA C TT GT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATG 660 

CCAGCTACTC CTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAATGT TTTATTTCTT GCCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840 
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

KTYSYSFFRP ELIVUHLNYV HSEAKRRTKT KTLLSLLSFL DETSGLSTHL PCLSLSKECG 60 
VLHLDZHGKK EDMRITQQSS QLYLWDHGGF TIFKNLWKSL IPRGNKRSPK RVTETILRDP 120 
KQKQSSK3QH ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG 
SEQ ID NO:203 PAB2 DNA SEQUENCE

Nucleic Acid Accession #: XM_050197

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

TCACACGTGC CAAGGGGCTG GCTCA6CGGA ACCAGCCTOC AOOCOCTGGC TCCGGGTGAC 60 

AGCCGOQCGC CTCGGCCAGG ATCTGAGTGA TGAGACGTGT CCCCACTGAG GTGCCCCACA 120 

GCAGCAGGTG TTGAGCATGG GCTGAGAAGC TGGACCGGCA CCAAAGGGCT GGCAGAAATG 180 

GGCGCCTGGC TGATTCCTAG GCAGTTGGCG GCAGCAAGGA GGAGAGGCCG CA6CTTCTGG 240 

AGCAGAGCCG AGACGAAGCA GTTCTGGAQT GCCTGAACGG CCCCCTGAGC CCTACCCGCC 300 

TGGCCCACTA TOG TCCAGAO GCTGTGGGTG AGCCGCCTGC TGCGGCACCG GAAAGCCCAG 360 

CTCTTGCTCG TCAACCTGCT AACCTTTGGC CTGGAGGTGT GTTTGGC C GC AGGCATCACC 420 

TATGTGCCGC CTCTGCTGCT GGAAGTGGGG GTAGAGGAGA AGTTCATGAC CATGGTGCTG 480 

OGCATTGGTC CAGTGCTGGG CCTGGTCTGT GTCCCGCTCC TAGGCTCAGC CAGTGACCAC 540 

TGGCGTGGAC GCTATQGCCG CCGCCGGCCC TTCATCTGGG CACTGTCCTT GGGCATCCTG 600 

CTGAGCCTCT TTCTCATCCC AAGGGCCGGC TGGCTAGCAG GGCTCCTGTG CCCGGATCCC 660 

AGCCCOCTGG AGCTGGCACT GCTCATCCTG GGCGTGGGGC TGCTGGACTT CTGTGGCCAG 720 

GTGTGCTTCA CTCCACTGGA 6GCCCTQCTC TCTGACCTCT TCCGGGACCC GGACCACTGT 780 

CGCCAGGCCT ACTCTGTCTA TGCCTTCATG ATCAGTCTTG GGGGCTGCCT GGGCTACCTC 840 

CTGCCTGCCA TTGACTGGGA CACCAGTGCC CTGGCCCCCT ACCTGGGCAC CCAGGAGGAG 900 

TGCCTCTTTG GCCTGCTCAC CCTCATCTTC CTCACCTGCG TAGCAGCCAC ACTGCTGGTO 960 

GCTGAGGAGG CAGCGCTGGG CCCCACCQAO CCAGCAGAAG GOCTOTCGGC CCCCTOCTTG 1020 

TCGCCCCACT GCTGTCCATG CC GGGC COGC TTGGCTTTCC GGAACCTGOG CGCCCTGCTT 1080 

CCCCGGCTGC ACCAGCTGTG CTGCCGCATG CCCCGCACCC TGCGCCG6CT CTTCGT6GCT 1140 

GAGCTGTGCA GCTGGATGGC ACTCATGACC TTCACGCTGT TTTACACGGA TTTCGTGGGC 1200 

GAGGGGCTGT ACCAGGGCGT GCCCAGAGCT 6AG0C6GGCA CCOAGGCCCG GAGACACTAT 1260 

GATGAAGGCG TTCGGATGGG CAGCCTGGGG CTGTTCCTGC AGTGCGCCAT CTCCCTGGTC 1320 

TTCTCTCTGG TCATGGACCG GCTGGTGCAG CGATTCGGCA CTCGAGCAGT CTATTTGGCC 1380 

AGTGTGGCAG CTTTCCCTGT GGCTGCCGGT GCCACATGCC TOTCCCACAG TGTGGCCGTG 1440 

GTGACAGCTT CAGCCGCCCT CACCGGGTTC ACCTTCTCAG CCCTGCAGAT CCTGCCCTAC 1500 

ACACTGGCCT CCCTCTACCA CCGGGAGAAG CAGGTGTTCC TGCCCAAATA CCGAGGGGAC 1560 

ACTGGAGGTG CTAGCAGTGA GGACAGCCTG ATGACCAGCT TCCTGCCAGG CCCTAAGCCT 1620 

GGAGCTCCCT TCCCTAATGG ACACGTGGGT GCTGGAGGCA GTGGCCTGCT CCCAOCTCCA 1680 

CCCGCGCTCT GCGGGGCCTC TGCCTSTGAT GTCTCCGTAC GTGTGGTGGT GGGTGAGCCC 1740 

AOCGAGGCCA GGGTGGTTCC GGGCCGGGGC ATCTGCCTGG ACCTCGCCAT CCTGGATAGT 1800 

GCCTTCCTGC TGTCCCAGGT GGCCCCATCC CTGTTTATGG GCTCCATTGT CCAGCTCAGC 1860 

CAGTCTGTCA CTGCCTATAT GGTGTCTGCC GCAGGCCTGG GTCTGGTCGC CATTTACTTT 1920 

GCTACACAGG TAGTATTTGA CAAGAGCGAC TTGGCCAAAT ACTCAGC GTA GA AAACTTCC 1980 

AGCACATTGG GGTGGAGGGC CTGCCTCACT GGGTCCCAGC TCCCCGCTCC TGTTAGCCCC 2040 

ATGGGGCTGC CGGGCTGGCC GCCAGTTTCT GTTGCTGCCA AAGTAATGTG GCTCTCTGCT 2100 

GCCACCCTGT GCTGCTGAGG TGCGTAGCTG CACAGCTGGG GGCTGGGGCG TCCCTCTCCT 2160 

CTCTCCCCAG TCTCTAGGGC TGCCTGACTG GAGGCCTTCC A&GGGGGTTT CAGTCTGGAC 2220 

TTATACAGGG AGGCCAGAAG GGCTCCATGC ACTGGAATGC GGGGACTCTG CAGGTGGATT 2280 

ACCCAGGCTC AGGGTTAACA GCTAGCCTCC TAGTTGAGAC ACACCTAGAG AAGGGTTTTT 2340 

GGGAGCTGAA TAAACTCAGT CACCTGGTTT CCCATCTCTA AGCCCCTTAA CCTGCAGCTT 2400 

CGTTTAATGT AGCTCTTGCA TGGGAGTTTC TAGGATGAAA CACTCCTCCA TGGGATTTGA 2460 

ACATATGAAA GTTAT7IGTA GGGGAAGAGT CCTGAGGGGC AACACACAAG AACCAGGTCC 2520 

CCTCAGCCCC ACAGGCACTG gTCCTTTTTO CTNGANTCCA CCCCCOCCCT CTTTACCCTT 2580 
TT 
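
The "Coding sequence" ranges in these entries appear to be 1-based and inclusive; in the entry above, for example, the stated range 310-1971 begins at an ATG and ends on a stop codon. The following is a minimal Python sketch, under that assumption, of how a listing row can be joined into one string and the stated range checked. The function names and the toy row are hypothetical and are not part of the listing; OCR damage in the printed rows is not corrected by this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def join_listing_lines(lines):
    """Concatenate listing rows, dropping spaces and the trailing position counter."""
    parts = []
    for line in lines:
        tokens = line.split()
        # A full row ends with its running position (for example "60"); drop it.
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]
        parts.append("".join(tokens))
    return "".join(parts).upper()


def coding_region(seq, start, end):
    """Return the span start..end of seq, treating both ends as 1-based and inclusive."""
    return seq[start - 1:end]


def has_start_and_stop(cds):
    """True if cds is a whole number of codons, starts with ATG and ends on a stop codon."""
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


# Toy row (hypothetical, not taken from the listing): coding sequence 4-15.
toy_rows = ["CCGATGGCTA AATAGGT 17"]
toy_seq = join_listing_lines(toy_rows)
toy_cds = coding_region(toy_seq, 4, 15)
print(toy_cds, has_start_and_stop(toy_cds))
```

A check of this kind only confirms that a stated range is internally consistent with the printed sequence; it does not validate the sequence against the cited accession.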



SEQ ID NO:204 PAB2 Protein sequence

Protein Accession #: XP_090197

1 11 21 31 41 51 

I I I I I I 

HVQRLWVSRL LRHRKAQLLL VNLLTFGLEV CLAAGITYVP PLLLEVGVEE KFMTMVLGIG 60 
PVLGLVCVPL LGSASSHWRG RYGRRRPFIW ALSLGILLSL FLIPRAGWLA GLLCPDPRPL 120 
BLALLILGVG LLDFCGQVCF TPLBALLSDL FRDPDHCRQA YSVYAFMISL GGCLGYLLPA 180 
IDWDTSALAP YLGTQEECLP GLLTLIFLTC VAATLLVAEE AALGPTEPAB GLSAPSLSPH 240 
CCPCRARLAF RNLGALLPRL HQLCCRMPRT LRRLFVAELC SWMALMTFTL FYTDFVGEGL 300
YQGVPRAEPG TEARRHYDEG VRMGSLGLFL QCAISLVFSL VMDRLVQRFG TRAVYLASVA 360
AFPVAAGATC LSHSVAVVTA SAALTGFTFS ALQILPYTLA SLYHREKQVF LPKYRGDTGG 420
ASSEDSLMTS FLPGPKPGAP FPNGHVGAGG SGLLPPPPAL CGASACDVSV RVVVGEPTEA 480
RVVPGRGICL DLAILDSAFL LSQVAPSLFM GSIVQLSQSV TAYMVSAAGL GLVAIYFATQ 540
VVFDKSDLAK YSA

SEQ ID NO:205 PAJ3 DNA SEQUENCE

Nucleic Acid Accession #: AK002128

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

Ai m yiT CGO C GGGGGCTOCT TGCGTGGATT TCCCGGGTGG ' J UI MT IIGL'T GGTGCT C CTC 60 

TGCTGTGCTA TCTCTGTCCT GTACATGTTG GGCTGCACCC CAAAAGGTGA CGAGGAGCAG 120 

CTGGCACTGC CCAGGGCCAA CAGCCCCACG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180

GAGTGGGAGG AGCAGCACCG CAACTACGTG AGCAGCCTGA AGCGGCAGAT CGCACAGCTC 240
Protein Accession #: NP_060841



AAGGAGGAGC TGCAGGAGAG GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300

GCTGCTGGCC TGGGTCTGGA CAGGAGCCCC CCAGAGAAAA CCCAGGCCGA CCTCCTGGCC 360

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAGACTGGC 480

CTTACCCGCC ACCCCGAGGA GAAGCCTGTG AGGAAGGACA AGCGGGATGA GTTGGTGGAA 540

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCACCGT 600

CCTTACACGG CCTCTGATTT CATAGAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAATA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTGTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140 

TATCCAGTTC TTTTCAGTCA GTACAATCCT GGCATAATAT ACGGCCACCA TGATGCAGTC 1200

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGGAAACTG GATTTTGGAG AGACTTTGGA 1260

TTTGGGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGGAC 1320

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA AGTATCTCCA CAGCAACCTC 1380

ATAGTGGTAC GGACGCCTGT GCGAGGACTC TTCCACCTCT GGCATGAGAA GCGCTGCATG 1440 

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGCAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 

SEQ ID NO:206 PAJ3 Protein sequence:



1 11 21 31 41 51

I I I I I I

MVRRGLLAWI SRVVVLLVLL CCAISVLYML ACTPKGDEEQ LALPRANSPT GKEGYQAVLQ 60

EWEEQHRNYV SSLKRQIAQL KEELQERSEQ LRNGQYQASD AAGLGLDRSP PEKTQADLLA 120

FLHSQVDKAE VNAGVKLATE YAAVPFDSFT LQKVYQLETG LTRHPEEKPV RKDKRDELVE 180

AIESALETLN NPAENSPNHR PYTASDFIEG IYRTERDKGT LYELTFKGDH KHEFKRLILF 240

RPFGPIMKVK NEKLNMANTL INVIVPLAKR VDKFRQFMQN FREMCIEQDG RVHLTVVYFG 300

KEEINEVKGI LENTSKAANF RNFTFIQLNG EFSRGKGLDV GARFWKGSNV LLFFCDVDIY 360

FTSEFLNTCR LNTQPGKKVF YPVLFSQYNP GIIYGHHDAV PPLEQQLVIK KETGFWRDFG 420

FGMTCQYRSD FINIGGFDLD IKGWGGEDVH LYRKYLHSNL IVVRTPVRGL FHLWHEKRCM 480
DELTPEQYKM CMQSKAMNEA SHGQLGMLVF RHEIEAHLRK QKQKTSSKKT 



SEQ ID NO:207 PAJ5 DNA SEQUENCE

Nucleic Acid Accession #: AF169723

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAATGGTCTA AACAAATGTG AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAATGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTTA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTGGCAA TACTTATCGT TGTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360 

CCAGAATGCC ATTGTGTGCG TGAAGGAAAA TTGGAGCATA CACTTGCCCG AGACTTGGTT 420 

CCAGGTGATA CAGTTTGCCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480 

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCTTGTTCT 540

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TGGAACAGGA 660

GAAAATTCTG AATTTGGGGA GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780 

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840 

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TCCCCATTGT GGTCACAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TGGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGGG GAAGCCAACA GAAGGGGCCT TAATTGCTCT TGCAATGAAG 1260 

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320 

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CCAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCAG AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGGACAGCTG 1560

ACATTTCTTG GCTTGGTGGG AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTG TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740

GAAGAAATAG ATGCAATGGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAAGGGT 1860

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920
ATTGGAGTTG CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC 1980

CTAGTGGATG ATGATTTTCA AACCATAATG TCTGCAATCG AAGAGGGTAA AGGGATTTAT 2040

AATAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTATAGGAGC ATTAACTTTA 2100

ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 2160

ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGCCTTG GAGTAGAACC AGTGGATAAA 2220 

GATGTCATTC GTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAACTTGATA 2280 

CTTAAAATAC TTGTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT 2340 

GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 2400

TTTTTTGACA TGTTCAATGC ACTAAGTTCC AGATCCCAGA CCAAGTCTGT GTTTGAGATT 2460

GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCAGTTCTTG GATCCATCAT GGGACAATTA 2520

CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 2580

GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTGGCAGA AATTATAAAG 2640 

AAGGTTGAAA GGAGGAGGGA AAAGATCCAG AAGCATGTTA GTTCGACATC ATCATCTTTT 2700 
CTTGAAGTAT GA

SEQ ID NO:208 PAJ5 Protein sequence
Protein Accession #: AAF27813

1 11 21 31 41 51 

I I I I I I 

MIPVLTSKKA SELPVSEVAS ILQADLQNGL NKCEVSHRRA FHGWNEFDIS EDEPLWKKYI 60

SQFKNPLIML LLASAVISVL MHQFDDAVSI TVAILIVVTV AFVQEYRSEK SLEELSKLVP 120

PECHCVREGK LEHTLARDLV PGDTVCLSVG DRVPADLRLF EAVDLSIDES SLTGETTPCS 180

KVTAPQPAAT NGDLASRSNI AFMGTLVRCG KAKGVVIGTG ENSEFGEVFK MMQAEEAPKT 240

PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEMFT ISVSLAVAAI PEGLPIVVTV 300

TLALGVMRMV KKRAIVKKLP IVETLGCCNV ICSDKTGTLT KNEMTVTHIF TSDGLHAEVT 360

GVGYNQFGEV IVDGDVVHGF YNPAVSRIVE AGCVCNDAVI RNNTLMGKPT EGALIALAMK 420

MGLDGLQQDY IRKAEYPFSS EQKWMAVKCV HRTQQDRPEI CFMKGAYEQV IKYCTTYQSK 480

GQTLTLTQQQ RDVYQQEKAR MGSAGLRVLA LASGPELGQL TFLGLVGIID PPRTGVKEAV 540

TTLIASGVSI KMITGDSQET AVAIASRLGL YSKTSQSVSG EEIDAMDVQQ LSQIVPKVAV 600

FYRASPRHKM KIIKSLQKKG SVVAMTGDGV NDAVALKAAD IGVAMGQTGT DVCKEAADMI 660

LVDDDFQTIM SAIEEGKGIY NNIKNFVRFQ LSTSIAALTL ISLATLMNFP NPLNAMQILW 720

INIIMDGPPA QSLGVEPVDK DVIRKPPRNW KDSILTKNLI LKILVSSIII VCGTLFVFWR 780

ELRDNVITPR DTTMTFTCFV FFDMFNALSS RSQTKSVFEI GLCSNRMFCY AVLGSIMGQL 840

LVIYFPPLQK VFQTESLSIL DLLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSF 900
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE

Nucleic Acid Accession #: N62098

Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360 

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540 

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960 

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260 
ATTAGTATCT TTCAACTCGA GTAA 



SEQ ID NO:210 PAV4 Variant 1 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60
LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120
GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK 180
PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 240
FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300
IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM 360
SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 420
ISIFQLE

SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60 

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600 

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780 

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900 

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200 
TAA 



SEQ ID NO:212 PAV4 Variant 2 Protein sequence

Protein Accession #: none found



1 11 21 31 41 51

I I I I I I

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60

SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120

TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180

EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240

GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300

NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE 

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE

Nucleic Acid Accession #: N62096

Coding sequence: 1-1140 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360 

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600 

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660 

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA

SEQ ID NO:214 PAV4 Variant 3 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 60

PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLILGIV MARAISLGPH 120

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF 180

ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN 240

VFFGGNLSSV FHIVVTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL 300

SEEPRTHSDK IMSCVMLPIG AVVMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES 360
HVQQTTQLST LNISIFQLE



SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE

Nucleic Acid Accession #: N62098

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60
ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120 
GTTGTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180 
GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCCCTT 240
GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300 
AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360 
ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420 
ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480 
ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540 
TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 
TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 
ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720
TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780
GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840
TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900
AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence
Protein Accession #: none found



1 11 21 31 41 51

I I I I I I

MGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFN VVNSIIGSGI IGLPYSMKQA 60

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120

IAMISYNIIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVMSFA FICHHNSFLV 240

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIVV TVMVITVATL VSLLIDCLGI 360

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAVVMV FGFVMAITNT 420
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ

SEQ ID NO:217 PAV9 DNA SEQUENCE

Nucleic Acid Accession #: NM_017638

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGGAGGATG CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCTGGTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACAGGAGCCT GGATTGTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTGTACGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTGTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480

GCGAGGTACC GGTGGCGCGG TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTUUTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660

ATTGACATCC CTGTCCTGCT CCTCCTGATT GATGGTGATG AGAAGATGTT GACGCGAATA 720

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780

GACTGCCTGG CGGAGACCCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840

GGCGAAGCCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACCCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960
GGGTCTGAGG AATTCGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG TGGGAGCTCO 1020

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTTGGAACCG CGTGGACATT 1080

GCCCAGAGTG AACTCTTTCG GGGGGACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140

CTCATGGACG CCCTGCTGAA TGACCGGCCT GAGTTCGTGC GCTTGCTCAT TTCCCACGGC 1200

CTCAGCCTGG GCCACTTCCT GACCCCGATG CGCCTGGCCC AACTCTACAG CGCGGCGCCC 1260

TCCAACTCGC TCATCCGCAA CCTTTT66AC CAOGCOTCCC ACAGCGCAGG CACCAAAGCC 1320 

CCACCCCTAA AAOGGGGAGC TGCGGAGCTC CGGCCCCCTC AOGTGGGGCA TGTGCTGAGG 1380 

ATGCTGCTGG GGAAGATGTG CGCGCCGAGG TACCCCTCCG GGGGCGCCTG GGACCCTCAC 1440

CCAGGCCAGG GCTTCGGGGA GAGCATGTAT CTGCTCTCGG ACAAGGCCAC CTCGCCGCTC 1500

TCGCTGGATG CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTC GGCACTGTTG 1560

CTGAACAGGG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620

GeA CT T SCGO OCTGTTTQCT GCTCCGGGTG ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680 

GCAGCACGGA GGAAA6ACCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGGC 1740 

GAOTGCTATC GCAGCAGTGA GGTGAGGGCT GCCOGCCTOC TCCTCCGTCG CTGCCCGCTC 1800 

TGGGGGGATG CCACTTGCCT CCAGCTGGCC ATGCAAOCTQ ACGCCCGTGC CTTCTTTGCC 1860 

CAGGATGGGG TACAGTCTCT GCTGACACAG AAGTGGTGGG GAGATATGGC CAGCACTACA 1920 

C CCATC TGGG CCCTGGTTCT CGCCTTCTTT TGCCCTOCAC TCAT CTACA C CCGCCTCATC 1980 

ACCTTCAGGA AATCAGAAGA GGAGCCCACA C6GGAGGAGC TAQAGTTTGA CATGGATAGT 2040 

GTCATTAATG GGGAAGGGCC TGTCGGGACG GCGQACOCAO CCGAGAAGAC GCCGCTGGGG 2100 

GTC0C6CG0C ASTCGGG CCG TCCGGGTTGC TGCGGGGGCC GCTGCG GGGG GCGCCGGTGC 2160 

CTACGCCGCT GGTPCCACTT CTGGGGCGCG CCGGTGACCA TCTTCATGGG CAACGTGGTC 2220 

AGCTACCTGC TGTTCCTGCT GCTTTTCTCQ CGGOTOCTQC TCGTGGATTT CCAGCCGGCQ 2280 

CCGCCCG6CT CCCTGGAGCT GCTGCTCTAT TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340 

CTGCGCCAGG GOCTGAGCGG AG6CGGGG6C AGCCTCGCCA GCGGGGGCCC C6G6CCTGGC 2400 

CATGCCTCAC TGAGCCAGCG CCT6CGCCTC TACCTCGCCO ACAGCTGGAA CCAGTGCGAC 2460 

CTAGTGGCTC TCACCTGCTT CCTCCTGGGC GTGGGCTGCC OGCTQACCCC GGGTTTOTAC 2520 

CACCT6G6CC 6CACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCO GCTGCTTCAC 2580 

ATCTTCACGG TCAACAAACA GCTGGGGCCC AAGATCGTCA TCGTGAGCAA GATGATGAA6 2640 

GACGTGTTCT TCTTCCTC T T CTTCCTCGGC GTGTGGCTGQ TAGCCTAT6G CGTG GC C A CO 2700 

GAG6GGCTCC TGAGGCCAOO GGACAGTGAC TTCCCAAGTA TCCTGCGCCO CGTCTTCTAC 2760 

OCTCCCTACC TGCAGATCTT CGGGCAGATT CCCCAGGAGG ACATGGACGT GGCCCTCATG 2820 

GAOC ACAG CA ACT6CTCGTC GGAGCCCGGC TTCTGGGCAC ACCCTCCTGG GGCCCAGGCG 2880 

GGCACCTGCG TCTCCCAGTA TGCGAACTGG CTGGTGCTGC TGCTCCTCGT CATCTTCCTG 2940

CTCGTGGCCA ACATCCTGCT GGTCAACTTO CTCATTGCCA TQTTCAGTTA CACATTCGQC 3000 

AAAGTACAGO GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060 

TTCCACTCTC GGCCCGCGCT GGCCCCGCOC TTTATCGTCA TCTCCCACTT 6C6CCT0CTG 3120 

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTC6AGCAT 3180 

TTCCG6GTTT ACCTTTCTAA GGAAGCCGAQ CGGAAGCTGC TAAC6T6GGA ATCGGTGCAT 3240 

AA66A0AACT TT CT GCTGGC ACGCGCTAGG GACAAGCGG8 AGAGCGACTC CGAGCGTCTG 3300 

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC T6GGACACAT CCGCGAGTAC 3360 

GAACAGCGCC TGAAA5TGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCGT CCTGGGGTGG 3420 

GTGGOCGAGG CCCTGAGCCG CTCTGCCTTQ CTGCCCCCAG GTGGGCCGCC ACCCCCTGAC 3480 
CTGCCTGGGT CCAAAGACTG A 



SEQ ID NO:218 PAV9 Protein sequence

Protein Accession #: none found

1 11 21 31 41 51 

I I I I I I 

MEDAFGAAVV TVWDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD PAAVYSLVTR 60

TWGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAWIVTGGL HTGIGRHVGV 120

AVRDHQMAST GGTKVVAMGV APWGVVRNRD TLINPKGSFP ARYRWRGDPE DGVQFPLDYN 180

YSAFFLVDDG THGCLGGENR FRLRLESYIS QQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240

ENATQAQLPC LLVAGSGGAA DCLAETLEDT LAPGSGGARQ GEARDRIRRF FPKGDLEVLQ 300

AQVERIMTRK ELLTVYSSED GSEEFETIVL KALVKACGSS EASAYLDELR LAVAWNRVDI 360

AQSELFRGDI QWRSFHLEAS LMDALLNDRP EFVRLLISHG LSLGHFLTPM RLAQLYSAAP 420

SNSLIRNLLD QASHSAGTKA PALKGGAAEL RPPDVGHVLR MLLGKMCAPR YPSGGAWDPH 480

PGQGFGESMY LLSDKATSPL SLDAGLGQAP WSDLLLWALL LNRAQMAMYF WEMGSNAVSS 540

ALGACLLLRV MARLEPDAEE AARRKDLAFK FEGMGVDLFG ECYRSSEVRA ARLLLRRCPL 600

WGDATCLQLA MQADARAFFA QDGVQSLLTQ KWWGDMASTT PIWALVLAFF CPPLIYTRLI 660

TFRKSEEEPT REELEFDMDS VINGEGPVGT ADPAEKTPLG VPRQSGRPGC CGGRCGGRRC 720

LRRWFHFWGA PVTIFMGNVV SYLLFLLLFS RVLLVDFQPA PPGSLELLLY FWAFTLLCEE 780

LRQGLSGGGG SLASGGPGPG HASLSQRLRL YLADSWNQCD LVALTCFLLG VGCRLTPGLY 840

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFFLFFLG VWLVAYGVAT 900

EGLLRPRDSD FPSILRRVFY RPYLQIFGQI PQEDMDVALM EHSNCSSEPG FWAHPPGAQA 960

GTCVSQYANW LVVLLLVIFL LVANILLVNL LIAMFSYTFG KVQGNSDLYW KAQRYRLIRE 1020

FHSRPALAPP FIVISHLRLL LRQLCRRPRS PQPSSPALEH FRVYLSKEAE RKLLTWESVH 1080

KENFLLARAR DKRESDSERL KRTSQKVDLA LKQLGHIREY EQRLKVLERE VQQCSRVLGW 1140
VAEALSRSAL LPPGGPPPPD LPGSKD

SEQ ID NO:219 PBF1 DNA SEQUENCE

Nucleic Acid Accession #: AAQ54237

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

ATGGAGCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60

CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA CCGACCCCCG GCGCCACAAG 120

GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180
CCGCTGTCGC ACCTGCCGCT GCGGGACTCG CCCCCGCTGG G6CG0C6GCT G C TOOOG G GC 240

GGCCCGGGGC OCGCCGACCC CGAGTCCTGG CGCTCGCTOC TCC06CTCGQ CGGGCTGGAC 300 

GCCGAGTGCG GCCGGCCCCT CTTCGCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCATCG ACCGGGACAT CGACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACG 420

GCCATCAAGT ACCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480

AAGACCAIAC AGCAAGATGA GTGGCACCTO CTTCATTTAA GAAGAATCAC TGCTGGCTTC 540 

CTCGGCATGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTGT TCCTCATGAC AGGGATATTT 660 

TGCACCATTT CXXTCTGTAC TTATGCCGCC AGTATCTCGT ATGATTTGAA CCGGCTCCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780

GCCTGGTGCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATCGC TTATCCGTTT 840
ATTAGCCGGA CCAAGATTGC ACAGCTAAAG TCTGGCAGAG ACTCCACGGT ATGA

SEQ ID NO:220 PBF1 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

MEPRALVTAL SLGLSLCSLG LLVTAIFTDH WYETDPRRHK ESCERSRAGA DPPDQKNRLM 60

PLSHLPLRDS PPLGRRLLPG GPGRADPESW RSLLGLGGLD AECGRPLFAT YSGLWRKCYF 120

LGIDRDIDTL ILKGIAQRCT AIKYHFSQPI RLRNIPFNLT KTIQQDEWHL LHLRRITAGF 180

LGMAVAVLLC GCIVATVSFF WEESLTQHVA GLLFLMTGIF CTISLCTYAA SISYDLNRLP 240
KLIYSLPADV EHGYSWSIFC AWCSLGFIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV






SEQ ID NO:221 PCM DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TCGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:222 PCM Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51

I I I I I I

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120

KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180

VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240

IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300

MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360
EDGHTDNHLP LLENNTH 

SEQ ID NO:223 PEZ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_001935.1

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CGCGCGTCTC CGCOGCCOGC GTGACTTCTG OCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATGATGCT 180

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300
ATCTTGGTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAGAA CAGTACATTT 360

GATGAGTTTG GACATTCTAT CAATGATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420 

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480 

GATTTAAATA AAAGGCAGCT GATTACAGAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TATGTTTGGA ACAATGACAT TTATGTTAAA 600

ATTGAACCAA ATTTACCAAG TTACAGAATC ACATGGACGG GGAAAGAAGA TATAATATAT 660

AATGGAATAA CTGACTGGGT TTATGAAGAG GAAGTCTTCA GTGCCTACTC TGCTCTGTGG 720

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780 

ATTGAATACT (XTTCTACTC TGATGAGTCA CTGCAGTACC CAAAGACTGT ACGGGTTCCA 840 

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900 

CTCAGCTCAG TCACCAATGC AACTTCCATA CAAATCACTG CTCCTGCTTC TATGTTGATA 960 

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGG 1020 

CTCAGGAGGA TTCAGAACTA TTCGGTCATG GATATTTGTG ACTATGATGA ATCCAGTGGA 1080 

AGATGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAATGA GTACTACTGG CTGGGTTGGA 1140 

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200 

AGCAATGAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260 

TTTATTACAA AAGGCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320 

TACTACATTA GTAATGAATA TAAAGGAATG CCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380 

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTGTCAG 1440 

TACTATTCTG TGTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCCGGTCCT 1500 

GGTCTGCCCC TCTATACTCT ACACAGCAGC GTGAATGATA AAGGGCTGAG AGTCCTGGAA 1560 

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA TGCCCTCCAA AAAACTGGAC 1620 

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTGCCTCC TCATTTTGAT 1680 

AAATCCAAGA AATATCCTCT ACTATTAGAT GTGTATGCAG GCCCATGTAG TCAAAAAGCA 1740 

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800 

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860 

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTGAAG CAGCCAGACA ATTTTCAAAA 1920 

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980 

ACCTCAATGG TCCTGGGATC GGGAAGTGGC GTGTTCAAGT GTGGAATAGC CGTGGCGCCT 2040 

GTATCCCGGT GGGAGTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100 

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160 

AAACAAGTTG AGTACCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220 

TCAGCTCAGA TCTCCAAAGC CCTGGTCGAT GTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280 

ACTGATGAAG ACCATGGAAT AGCTAGCAGC ACAGCACACC AACATATATA TACCCACATG 2340

AGCCACTTCA TAAAACAATG TTTCTCTTTA CCTTAGCACC TCAAAATACC ATGCCATTTA 2400 

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGATGA 2460 

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGGT TACCTTTGTT CCCAAATTTC 2520 

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580 

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640 

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAQA1TCC CTGAGCAGGA TTTTAATCTT 2700 

TTTCTAACTO GACTGGTTCA AATGTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760 

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAG GGAGAGAAGA TAGCAGGGCA 2820 

TGGCTGGGAA COCAAGTCCA AGCATACCAA CACGAGCAGG CTACTGTCAG CTCCCCPCGG 2880 

AGAAGAGCTG TTCACCACGA GACTGGCACA GTTTTCTGAG AAAGACTATT CAAACAGTCT 2940 

CAGGAAATCA AATATCGAAA GCACTGACTT CTAAGTAAAC CACAGCAGTT GAAAGACTCC 3000 

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCCCCAG GTGCCAGTTA TGGCTATAGG 3060 

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120 

TACTGATGTT OCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGCC 3180 

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240 

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300 

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCCC ATTCTACATA 3360 
CTATGGAATT TCTCCCAGTC ATTTAATAAA TGTGCCTTCA TTTTTTC 



SEQ ID NO:224 PEZ3 Protein sequence

Protein Accession #: NP_001926.1

1 11 21 31 41 51

I I I I I I

HKTPWKIIiLG LLGAAALVTI ITVPWLLNK GTDDATADSR KTYTLTDYLK NTYRLKLYSL 60 

RWISDHEWiY KQENNILVFN AEYGNSSVPL ENSTPDEFGH SINDYSISPD GQFILLEYNY 120 

VKQWRHSYTA SYDIYDLNKR QLITEERIPN NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL 180 

FSYRZTWTGK EDIIYNGITD WVYBEEVPSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF 240 

YSDESLQYPK TVKVPYPKAG AVNPTVKFFV VNTDSLSSVT HATSIQITAP ASMLIGDHYL 300 

CDVTOATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST TGWVGRFRPS 360 

EPHFTLDGNS FYKIISNEEG YRHXCYFQID KKUCTP1TKG TWEVIG1EAL TSDYLYYISN 420 

EYKGHPGGRN LYKIQLIDYT KVTCLSCELN PERCQYYSVS PSREAKYYQL RCSGPGLPLY 480 

TLHSSVNDKG LRVLEDNSAL DKMLQNVQMP SKKLDFIILN ETKPWYQMIL PPHFDKSKKY 540 

PLLLDVYAGP CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDK1 MHAINRRLGT 600 

FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG IAVAPVSRWE 660 

YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY LLIHGTADDN VHFQQSAQIS 720 
KALVDVGVDF QAMWYTDEDH GIASSTAHQH IYTHMSHFIK QCFSLP 

SEQ ID NO:225 PBJ2 DNA SEQUENCE 

Nucleic Acid Accession #: none found 

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC 60 

AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC 120 

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT 180 

GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGCCCACAC TGGATGTCTA A 



SEQ ID NO:226 PBJ2 Protein sequence 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MALAKVREPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RNPQELWMGL LLLMGVLEAC 60 
VEMRPLSVWS LRDDKEQSPH QPTLDV 
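
The relationship between the PBJ2 coding sequence (positions 1-261) and the protein listing above can be checked by translating the DNA with the standard genetic code. The following is a minimal sketch, not part of the original listing: the `translate` helper and the `PBJ2_CDS` constant are illustrative names, and the sequence is the listing as read here with OCR-damaged characters resolved by hand, which is an assumption.

```python
# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: PBJ2 coding sequence (positions 1-261) as read from the listing,
# with OCR-damaged characters resolved by hand.
PBJ2_CDS = """
ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC
AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC
AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT
GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC
CAGCCCACAC TGGATGTCTA A
"""


def translate(cds):
    """Translate a coding sequence; whitespace is ignored and a final stop codon is dropped."""
    bases = "".join(cds.split()).upper()
    if len(bases) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    protein = "".join(CODON_TABLE[bases[i:i + 3]] for i in range(0, len(bases), 3))
    return protein[:-1] if protein.endswith("*") else protein


pbj2_protein = translate(PBJ2_CDS)
print(len(pbj2_protein), pbj2_protein)
```

The same check can be applied to any other DNA/protein pair in the listing once the coding-sequence slice has been extracted with the stated start and stop coordinates.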

SEQ ID NO:227 PBM2 DNA SEQUENCE 

Nucleic Acid Accession #: none found 

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGCCAAATG CTGAGTTAGA AGCAAAGAGC CTTGGAAGCA GTAAATGTTT AAAAACTGCT 60 

CTCATACTTG CTGTATGTTG TGGATCAGCA AATATAGTCA GCCCTCTACT TGAGCAAAAT 120 

ATTGATGTAT CTTCTCAAGA TCTGGACAGA CGGCCAGAGA GTATGCTGTT TCTAGTCATC 180 

ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTAGAAGAT 240 

TTTATGGCTA TTGAAGAAGA AATGAAGAAG CACGGAAGTA CTCATGTGGG ATTCCCAGAA 300 

AACCTGACTA ATGGTGCCGC TGCTGGCAAT GGTGATGATG GATTAATTCC TCCAAGGAAG 360 

AGCAGAACAC CTGAAAGCCA GCAATTTCCT GACACTGAGA ATGAAGAGTA TCACAGGTTT 420 
GTCAAAGATC AGATAGTTGT AGATATGCGG CGTTATTTCT GA 

SEQ ID NO:228 PBM2 Protein sequence 

Protein Accession #: none found 



1 11 21 31 41 51 

| | | | | | 

MPNAELEAKS LGSSKCLKTA LILAVCCGSA NIVSPLLEQN IDVSSQDLDR RPESMLFLVI 60 
IMWTSFVEDN LSMGWGKLED FMAIEEEMKK HGSTHVGFPE NLTNGAAAGN GDDGLIPPRK 120 
SRTPESQQFP DTENEEYHRF VKDQIVVDMR RYF 

SEQ ID NO:229 PEZ2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014253 

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GACTGCTTGC ATTAAAGGAC TTCCTCATCC TTTTTTTCAT GAAACTGAGC TTGCTTAATC 60 

AGAGATGGAG CAAACTGACT GCAAACCCTA CCAGCCTCTA CCAAAAGTCA AGCATGAAAT 120 

GGATCTAGCT TACACCAGTT CTTCTGATGA GAGTGAAGAT GGAAGAAAAC CAAGACAGTC 180 

ATACAACTCC AGGGAGACOC TGCACGAGTA TAACCAGGAG CTGAGGATGA ATTACAATAG 240 

CCAGAGTAGA AAGAGGAAAG AAGTAGAAAA ATCTACTCAA GAGATGGAAT TCTGTGAAAC 300 

CTCTCACACT CTGTGCTCTG GCTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360 

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420 

TCCACTAAGA ATGTGGATAA GGGGAATGAA ATCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480 

GGOCAACICT GCATTATCCT TGACTGACAC TGACCATGAA AGGAAGTCTG ATGGGGAAAA 540 

TGGTTTCAAA TTCTCTCCTQ TTTGTTGTQA CATGGAGGCT CAAGCTGGGT CTACTCAAGA 600 

TGTGCAGAGC AGCCCACACA ACCAGTTCAC CTTCAGACCC CTCCCACCGC CACCTCCGCC 660 

TCCTCATGCC TGCACCTGTG CCAGGAAGCC ACCCCCTGCA GCGGACTCTC TTCAGAGGAG 720 

ATCAATGACT ACCOGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCCCAA CCAGCACGCA 780 

GGATTCAGTC CATCTGCATA ACAGCTGGGT CCTGAACAGC AACATACCAT TGGAGACCAG 840 

GCATTCCCTG TTCAAACATG GATCTGGTTC CTCTGCGATC TTCAGTGCAG CCAGTCAGAA 900 

CTACCCTCTG ACATCCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960 

CTTTTOCCGA CCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020 

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080 

AGTGCATTTG TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140 

TGGAGTTAGC AAAGGGAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200 

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGACGGG CGATAGACAC 1260 

TGGAGAAGTT GACATTGGTG CACAGGTCAT GCAGACCATT OCACCTGGTT TATPCIGGCO 1320 

TTTCCAGATT ACTATCCACC ATCCAATATA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380 

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440 

TGTAAAACTA ATGGATGGCA AACAGCTGGT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500 

ACAGCACTCC OCTCGGAACC TGATCTTAAC TTCGCTTCAG GAGACAGGTT TCATAGAGTA 1560 

TATGGATCAA GGAOCTTGGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620 

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTGT TCAACCAATT GCAATGGAAA 1680 

TGGAGAGTGT ATCTCTGGOC ATTGTCATTG TTTCCCAGGA TTCCTTGGAC CTGACTSTGC 1740 

TAGAGATTOC TGCCCTGTGC TGTGTGGTGG GAATGGAGAA TACGAGAAAG GACACTGTGT 1800 

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCOG GAAGAACAAT GCATTGATCC 1860 

AACATGCTTT GGOCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920 

AGGAGAAATA TGCGAGGAAG AGGACTGOCT AGACCCAATG TGTTOCAACC ATGGCATCTG 1980 

TGTAAAAGGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTGTG AAACACCACT 2040 

TCCTGTATGT CAAGAGCAGT GCTCAGGACA OGGAACTTTT CTTCTGGACG CTGGAGTATG 2100 

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAACA GAGCTGTOTA CCATGGAGTG 2160 

TGGTAGCCAT GGAGTCTGCT CAAGAGGAAT TTGCCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAAOGCTCCT GTGATICTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAG TGTAGCCCTG GATGGGAGGG CGACCACTGC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGCTGCC CAGGGCTCTG CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400 

AAATGGTTGG CACTGTGTGT GTCAGGTGGG TTGGAGTGGG ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTGGACAA TGATGGAGAT GGTTTAAOCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580 

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640 

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTOCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTOTGAT TCGAGGCCAA GTGGTGGCCA TAGATGGAAC 2760 

TCCTCTAGTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTGOC CATCGGTGGC ATCTCTGTCA TCTTAATCT? 2880 

CGAGCGATCC CCTTTCCTGC CTGAGAAGAG AACACTCTGG TTGCCTTGGA ATCAGTTTAT 2940 

TGTGGTAQAG AAAGTCACCA TGCAGAGAGT TGTATCAGAC CCGCCATCCT GCGATATCTC 3000 

CAACTTTATC AGCCCAAACC CTATTGTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAGAG AGGGGAACTA TTGTTCCTGA GCTGCAGGTT GTACAGGAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCGC ACCCCTGGGT ATAAAACCCT 3180 

GCTACGGATC CTTCTGACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTCCC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT TCGAACAAGA CCGATATCTA TGGACAGAAG GTT7GGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGATATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420 

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGG AATCATACAT AAAGGGAATG GAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCCC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TCTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCOCTGATG GCAGTGTGTA TGTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTCCCTCG GGAAACTCCG TTAGTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGACCCTG TQTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCA3CGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTTTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATCGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCATT 4200 

GTATGTCTTG GATAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCOCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTOCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTGAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

CGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATGTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGCGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TOGCCGTGAT GCAGGOGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GAGTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAATCCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTT TATGAGTATG ACCCCGAGGG 5040 

ACAOCTGACC AATGCAAOGT TTCCCACTGG AGAGGTCAGC AGCTTCCACA GTGAOCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATTTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220 

GGTGAATOCA GATGGTTGCC TGCGTGTCAC TTTTGCCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAACCCTACC CTGGGCAAAT GCAACATCTC 5340 

ATTGCCCGGA GAGCACAATG CAAACCTCAT CGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA CTGGGCGACC CATTCTGTGG TCTCCTOTAA GCAGATATAA 5580 

TGAAGTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTTATTCAAA GAGGAACGTG 5640 

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTATT TCAAGAACTT GGGCTGATGG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATG CTTCTCCTAC ACAGCCAGCG 5760 

GCGTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTGCGCCAC AGCTTACAAA OCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAGAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTTGTGA ATGCACGGTT 6180 

CGACTACAGC TACAACAATT TCCGAOTCAC AAGCATGCAA GCTGTAATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GGTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGOCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCAIAAG 6480 

GGTAGGAGTA GATGOCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATGGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACOCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TOGTCTTACT CCTCTCCGAT ATGACCTCCO 6660 

AGAOCGCATC ACCAGAITAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720 
GCAGAGGGGA AATGATATTT TTQAATATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780 
TAAGGCTTCT GGCTGGACTG TGCAGTATTA CTATGATGGG CTTGGGCGAC GTGTCGCGAG 6840 
TAAGTCCAGC CTAGGGCAGC ACCTTCAGTT CTTTGTCGAC GCGACCGCGA ACCCCATAAG 6900 
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGATCTCCA 6960 
A6GTCACCTT ATTGCCATGO AGTTAAGCAG TGGTGAAGAA TATTATGTAG CCTGTGATAA 7020 
TACAGGTACC CCACTAGCTG TGTTCAGCAG CCQAGGTCAG GTCATAAAGG AGATACTATA 7080 
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140 
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CT6GGGCAAA 66GATTATGA 7200 
TGTTGTTGCT GGCAGATGGA CAACGGOCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260 
TCCTAAACCA TTCAACCTCT ACTCCTTTGA AAATAACTAC CCAGTTGGCA AAATTCAAGA 7320 
TGTTGCAAAG TATACCACAG ACATCAGAAG TTGGTTGGAG CTATTTGGTT TCCAATTACA 7380 
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440 
TCTACGGCTT CAQACAAAAA CTCAAGAGTG GGATCCTGGA AAGACTATCC TGGGCATTCA 7500 
GTGTQAACTC CAGAAACAGC TCAGGAATTT CATTTCCTTQ GACCAACTAC CTATQACTOC 7560 
CCGATACAAT GATGGACGGT GCCTTGAAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620 
TTCTGTTTTT GGGAAAGGTA TAAAATTTCC CATCAAGGAT GGCATAGTAA CAGCTGATAT 7680 
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTGCTGOC ATTCTCAATA ATGCCCATTA 7740 
CCTGGAAAAC CTACATTTTA CCATAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800 
GTCTCTGGAG GAAGACCTGG TGCTCATCGG TAACACTGGG GGGAGGCGGA TTCTGGAGAA 7860 
TGGTGTCAAT GTCACTGTGT CCCAGATGAC TTCTCTGTTG AATGGGAGGA CTAGAOGGTT 7920 
TGCAGATATT CAGCTCCAGC ATGGAGCCCT GTGCTTCAAC ATCCGGTATG GGACAACTGT 7980 
CGAAGAGGAA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040 
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGG ATTAGGGCAT GGACAGAAGG 8100 
GGAAAAGCAG CAGCTTTTGA GCACTGGGCG GGTACAAGGT TACGATGGGT ATTTTGTTTT 8160 
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAGTGCCAAT AATATTCACT TTATGAGACA 8220 
GAGCGAAATA GGCAGGAGGT AACAAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGCCT 8280 
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGTATATGT 8340 
AAATATGGAG GAAAAACATA TCCAACTGCC TTTCAATGTG ACGGAAGATG GTATTTTAAT 8400 
ATTGTTTGTT TAAACTCTTT AAGAAATGAC AGAGATTTTT AGTTCTTGTG TGGCAGTATT 8460 
CAAAATAACA CAAGTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAAGC ACCACTTTCA 8520 
ASTTGCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580 
TTGTGAGAAG CAGTTTCATC CTTAACTGTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640 
GTGCAATAGT ATCTGAAACT TGCCTTTCGA AAGACTGCCA GCCCTTTGAC GTTTTCCAGA 8700 
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGAGTG 8760 
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTOCTTGC TTGTTAAAGT AAATGCCATA 8820 
TTGTOGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTACCATGCT TCOCTGTGGG 8880 
TGTGGTAACC AGACTGTATA GCCGCTATTT GCTCGTGTGT ACATGATACC AAAGCAGCTG 8940 
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000 
GCTGTATTGG TATCATGTAA ACATAGCTTT TATTAACCTG GGTAGGAATT TCTCATTTAT 9060 
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAGTATCTAT ACACTGACCC 9120 
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TGATTCTGGT ACAAAAACAA 9180 
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGOCGATA CATACATTAA CTACTACTGC 9240 
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AATAGACCTG GTACTTAAGT GAAAGTACTA 9300 
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTGCCAC ATTGTTTCAG CCCACTTAGA 9360 
TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATOGA 9420 
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480 
TTTACTAAAA TAATTTATAC AGTTAGTTAT y i TOOT L T t! CGTACTTACC CATTTATCTT 9540 
TATTTAATCG TCTCTACTGC CTAGGAAAAT AACTATTTTC CAGGACGGGT TATTTOTTCT 9600 
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660 
AATCTCTAGG AATCCTGCAG TAAAACAAGC CCCTTGGTGA GCTGGAAGAT TTGTGCCCAG 9720 
TGACAAAGAG ATAGTITGTA AAATGCTGTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780 
TGACAGCACA ATGTGGCCCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840 
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900 
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATTGCCTT GCTTTGCGAT GACAGTTTCT 9960 
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020 
CACAAAGGAA AGCAAGGGAA AGGAAATGAC CCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080 
AACATTTCAT TTTCAAAACC TTCGGGTTAG AATACCACTT ACACATGTAT TCTGAGAGAC 10140 
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200 
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTGAAT CAGGCCTGTA TTAATGGTAC 10260 
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320 
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380 
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440 
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTG CTTATATTTT AAGCAATTAT 10500 
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560 
ACAAATATTT GAAGCTTTTA CTTAATAGTG ATTACCTTGA ACTGTGCATT TCTAGTTTOT 10620 
AATACGTATT TGGTTGGTTC GTGCCTTTAG TTTGTTAAAG TTACATTTGT ATTATATTCA 10680 
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740 
TCTTTTTACA ACTCCTAAAG CTTGAGGGAG GAAAGAAAAA AAAAACAAAA CTACTAATCA 10800 
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860 
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920 
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980 
GAAACATATQ AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040 
ATTCGAGTAA GTTAAAGTGA GAGCATAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100 
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGCTTGT 11160 
GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220 
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280 
TTGA3TAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340 
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400 
AAAATAAGTG TGTCCTPTAC TGTCAATTTA TCGAGAAGAT CTATAATATA TAGACTACAT 11460 
ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520 
TCACATGCTA CCTATGTAGA CAGGTATGAA ATTAAGTTAT AATTTTCATG AGACATTTTC 11580 
ATCACTGTTG ACACAGTTTC AAGGCATTCC ATCATCTTAT TTTGACTCTT VfTClTlTTT 11640 
TTTTCTTTAA AAATATAT1T TTAACTAGAC CAGGCCCCAC TATAATATCA CTTAAGAGAG 11700 
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760 
TCATTGGTAA TAGAAGCAAA AGTACAGTAA CGAAGTATTO AAAAGAAAAT TTTGGAGACA 11820 
TTGGAGCAXA TTATATATAO CTTGTGQAAA GACATAAGGC TACAGATGGA ATGGAACATT 11880 
OCTOTTTTCT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CACAGCCTTC TATAAAGGTT CTTTCTTCTO CAAAGAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAAGCG CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060 
AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120 
TOTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAATATACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG AGTTAAACTT GCTGTGGATT TTGTCTTCGC 12240 
AGTTGTCATC TTACATTATT TGTCAAAGGA AATGTGTTTG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GTGGACTTTA ACCTCTTAAA TAAATGTTAG TATATCAGAT TGTGTCCTTG 12360 
AAAAATATTT TACTTGTATG AATCATGACA ACGTCTAAAT CTTTACTATT CTTCTGGCAA 12420 
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAAACATT 12480 
CTTTTTAGCT GCTTACTTTC TCATGAAAAG TAAAGATGTT TACAGTGTAT GCCAAGTTTT 12540 
CAGTTTCTGT ATAACAACAG GTAGAGGTTC TAATCATATT GAAAATTGTG TTATAATGGT 12600 
CTGAGCCATG TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTQ 12660 
AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTTAAAT GGTCTTTTGC 12720 
ATTTTGCTCT GGCTTTCAGG CCAGAAGCAT GCATTTTTCT ACAAGAGCAT CACAACAACA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATO TGTTGATATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TTAAAAAATA AAATTGCCAA TGAAAAAAAA 



SEQ ID NO:230 PEZ2 Protein sequence 
Protein Accession #: NP_055068 



1 11 21 31 41 51 

| | | | | | 

MEQTDCKPYQ PLFKVKHEHO LAYTSSSDES EDGRKPRQSY NSEETLHEYN GELHMNYNSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEMGSDVDTE TEGAASPDHA 120 

LRMWIRGHKS EHSSCLSSRA NSALSLTDTD HERKSDGENG FKFSFVCCDM EAQAGSTQDV 180 

QSSPHNQPTP RPLPPPPPPP HACTCARKPP PAADSLQRRS MTTRSQPSPA APAPPTSTQD 240 

SVHLHNSWVL NSNIPLETRH SLPKHGS6SS AIFSAASQNY PLTSNTVYSP PPRPLPRSTF 300 

SRPAFTFNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLPGLTWQLQ FVEGELYANG 360 

VSKGNRGTES MDTTYSPIGG KVSDKSEKKV FQKGRAIDTG EVDIGAQVMQ TIPPGLFWRF 420 

QITIHHPIYIi KFNISLRKDS LLGIYGRRNI PPTHTQFDFV KLMDGKQLVK QDSKGSDDTQ 480 

HSPR NULT S LQETGFTEYM DQGPWYLAPY NDGKKKEQVP VLTTAIED4D DCSTNCNGNQ 540 

ECISGHCHCF PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGKKGPECD VPEEQCIDPT 600 

CFGHGTCIMG VCICVPGYKG EICEEEDCLD PHCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQBQCSGHG TFLLDAGVCS CDPKWTGSDC STBLCTMECG SHGVCSRGIC QCEEGWVGPT 720 

CEERSCHSHC TEHGQCKDGK CECSPGWEGD HCTIAHYLDA VRDGCPGLCF GNGRCTLDQN 780 

GWHCVCQVGW SGTGCNWME MLCGBNLDND GDGI/TDCVDP DCCQQSNCYT SPLCQGSPDP 840 

LDLIQQSQTL FSQHTSRLFY DRIKFLIGKD STHVIPPEVS FDSRRACVIR GQWAIDGTP 900 

LVGVNVSFLH HSDYGFTISR QDGSPDLVAI GGISVILIFD RSPFLPEKRT LWLFWNQFIV 960 

VEKVTHQRW SDPPSCDISN FISFNPIVLP SPLTSFGGSC PERGTIVPEL QWQEEIPIP 1020 

SSFVRLSYLS SRTPGYKTLL RILLTESTIP VOMIKVHLTV AVEGRLTQKW FPAAINLVYT 1080 

PAWNKTDIYG QKVWGLAEAL VSVGYEYETC PDPILWEQRT WLQGFEMDA SNLGDWSLNK 1140 

HHILNPQSGI IHKGNGENMF ISQQPFVTST IKGNGHQRSV ACTNCNGPAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIF PSGNSVSILE LSTSPAHRYY LAMDPVSESL YLSDTNTRKV 1260 

YKLKSLVETK DLSKNFEWA GTGDQCLPFD QSHCGDGGRA SEASLNSPRG ITVDRHGFIY 1320 

FVDGTMIRKI DENAVZTTVI GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVNPMDNSLY 1380 

VLDNNXVLQI SENRRVRIIA GRPIHCQVPG IDHFLVSKVA IRSTLESARA ISVSHSGLLF 1440 

IAETDERKVN RIQQVTTNGB IYIIAGAPTD CDCXIDPNCD CFSGDGGYAX DAKMKAFSSL 1500 

AVSPDGTLYV ADIGNVRIRT ISRNQAHLND MNTYEIASPA DQELYQPTVN GTHLHTLNLI 1560 

TRDYVYNFTY NSEGDLGAIT SSNGNSVHIR RDAGGKPLWL WPGGQVYWL TISSNGVLKR 1620 

VSAQGYKPAL MTYPGNTGLL ATKSNENGWT TVYEYDPEGH LTNATFPTGE VSSPHSDLEK 1680 

LTKVELDTSN RENVLHSTNL TATSTIYTLK QENTQSTYRV NPDGSLKVTF ASGHEIGLSS 1740 

BPHILAGAVN PTLGKCNISL PGEHNANLIE KRQRKEQNKG NVSAFERRLR ABNRNLLSID 1800 

PDEITRTGKI YDDHRKFTLR ILYDQTGRPI LWSPVSRYNE VNITYSPSGL VTFIQRGTWN 1860 

EKKEYDQSGK IISRTWADGK IWSYTYLEKS VKLLLHSQRR YTFEYDQSDC LLSVTMPSHV 1920 

RHSLOTMLSV GYYRNIYTPP DSSTSPIQDY SRDGRLLQTL HLGTGRRVLY KYTKQARLSE 1980 

VLYDTTCVTL TYEESSGVIK TUTLMEDGFI CTIRYRQTGP LIGRQIFRFS EEGLVNARFD 2040 

YSYKNFRVTS MQAVINETPL PIDLYRYVDV SGRTEQFGKF SVIKYDI2IQV ITTTVMXHTK 2100 

ZFSAKGQVZE VQYEILKAIA YWMTIQYDNV GRHGNMCIRV GVDANITRYF YEYDADGQLQ 2160 

TVSVNDKTQW RYSYDLNGDI NLLSHGKSAR LTPLRYDLRD RITRLGEIQY KMDEDGFLRQ 2220 

RGNDIFEYNS NGLLQKAYNK ASGWTVQYYY DGLGRRVASK SSLGQHLQFF VDATANPIRV 2280 

THL YN HTSSS ITSLYYDLQG HLTAMFLSSG EEYYVACDNT GTPLAVFSSR GQVIKEILYT 2340 

PYGDIYHDTY PDFQVIIGFH GGLYDFLTKL VHLGORDYDV VAGRWTTAYB HIWKQLWLL? 2400 

KPFNLYSFEN NYPVGKIQDV AKYTTDIRSW LELFGFQLHN VLPGFPKPBL ENLBLTYELL 2460 

RLQTKTQEWD PGKTILGIQC BLQRQLRNFZ SLDQLPMTPR YNDGRCLEGG KQPRFAAVPS 2520 

VFGKGIKFAI KDGTVTADII GVANEDSRRL AAILNNAHYL QilflFTIEGR DXHYPIKLGS 2580 

LEEDLVLIGN TGGRRILENG VNVTVSQMTS LLNGRTRRFA DZQLQBGALC FNIRYGTTVE 2640 

EEKNHVT.ETA RQRAVAQAWT KEQRRLQBGE EGXRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700 
VBQYLBLSDS ANNIHFMRQS BIGRR 
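
The coding-sequence coordinates stated with each DNA listing can be cross-checked against the residue numbering of the matching protein listing: the span from start codon to stop codon, inclusive, must be a multiple of three, and one codon is the stop. The following is a small sketch of that arithmetic; the helper name `cds_residue_count` is illustrative and not part of the original listing.

```python
def cds_residue_count(start, end):
    """Number of amino acids encoded by a coding sequence given 1-based,
    inclusive coordinates that include the stop codon."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span {start}-{end} is not a whole number of codons")
    return length // 3 - 1  # the final codon is the stop codon


# Coordinates as stated in the listings above.
for name, start, end in [("PBJ2", 1, 261), ("PBM2", 1, 462), ("PEZ2", 65, 8242)]:
    print(name, cds_residue_count(start, end))
```

For PEZ2 the result agrees with the protein listing above, which runs to a last full row numbered 2700 followed by a final row of 25 residues.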

SEQ ID NO:231 PFD4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000441 

Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 



CTCAGCCTTC CCGGTTCGGG AAAGGGGAAG AATGCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC CGCAGCCTGT GTGTGTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTGCTCCGTA AATAAAAOGT CCCACTGCCT TCTGAGAGCG CTATAAAGGC AGCGGAAGGG 180 

TAGTCCGCGQ GGCATTCCGG GCGGGGCGCQ AGCAGAOACA GGTCATGGCA GCGCCAGGCG 240 

6CAGGTCGGA GCCGCCGCAfi CTCCCCGAGT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300 

TCTACAGCGA GCTOGCTTTC CAGCAACAGC ACGAGCGGCG CCTGCAGGAG CGCAAGACGC 360 

TGCGGGAGAG CCTGGCCAAG TGCTGCAGTT GTTCAAGAAA GAGAGCCTTT GGTGTGCTAA 420 

AGACTCTTGT GCCCATCTTG GAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATCCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACATA TCTCAGTTGG ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG CCCCCGACGA ACACTTTCTC GTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT ACTACTATGA TAGACACTGC AGCTAGAGAT AGAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTG ACTCTGCTGG TTGGAATTAT ACAGTTGATA TTTGGTGGCT 840 

TQCAGATTGG ATTCATAGTG AGGTACTTGG CAGATCCTTT GGTTGGTGGC TTCACAACAG 900 

CTGCTGCCTT CCAAOTGCTQ GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAACCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CGCTGGTTGA GATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCACCAT TGTCGTCTGT ATGGCAGTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCCCAGTCCC TATTCCTATA GAAGTAATTG 1140 

TGACGATAAT TGCTACTGCC ATTTCATATG GAGCCAACGT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGGTTTT TGCCTCCTGA ACTTCCACCT GTGAGCTTGT 1260 

TCTCGGAGAT GCTGGCTGCA TCATTTTCCA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320 

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT CGATGGGAAC CAGGAATTCA 1380 

TTGCCTTTGG GATCAGCAAC ATCTTCTCAG GATTCTTCTC T TO CTTT tf Jtf GCCACCACTG 1440 

CTCTTTCCCG CACGGCCGTC CAGGAGAGCA CTGGAGGAAA GACACAGGTT GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGOCCTGGG GAAGCTTCTG GAACCCTTGC 1560 

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGGATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTATAGTGTC CATCATTCTG GGGCTGGATC TCGGTTCACT AGCTGGCCTT ATATTTGGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTTGGAA TGGCCTTGGA AGCATCCCTA 1800 

GCACAGAIAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAATA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AATGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAQ 2160 

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC ATAGCCTTGT GCTTGACTGT GGAGCTATAT 2220 

CTTTCCTGGA CGTTGTTGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGACAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGATTGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2520 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAG TGGGTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTCCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGCCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TCTAGGCTTT ATTTATAAAA TAAACACCTT ATCCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCG TCCAGGGTAA GCTGGTGTTA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCTAATAATO TTCACGTGGG CCCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAGC GAGTCACGAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTG ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATTTATCAAT AAAAGCCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAATA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAATAATC ATGATCACAA CTGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTTAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420 

GTGCTATTCT GAGTGAAAAT TTTTTTGATG TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTTATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG ACAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTTCTT TTTATTTATT 3720 

WITCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCTT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTG AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATCCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCCC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

CCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080 

CCTGAATTCT GGTTGTAAAT CTGGTTACAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140 

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200 

AGGCGCGGTG GCTCATGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTAAA ACCCCGTCTC TACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380 

GGAGGCTAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGGT 4440 

CGTGCCACTG CACTCCAGCC TGGGOGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500 
AAAAAAAAAA AGAGTGAATG TAATAGTCTT GCAGAAAATG AATQAATACC TTTGTTCAAT 4560 

AAAGQAAATA TGCACTGCTC ACTTTTTTGA AGGAAATGCC AAAGTTACGT TTTACAACAA 4620 

GCCTAGAGTT TGTAAATTCT GGGTTCATTT GTGATGACAT AAGTCAGCAA ACTCCGGGAA 4680 

TACTGTCTCT TCTATGTATT TTGTGAATAG TAAGCATAAT TTTAOTTTTG TATTATCAAT 4740 

GAAAATTTCA CTTQAAATTA AAGCTGCCTT TTOTTATATT TTEAACCTAT AGGATAAGAT 4800 

TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA ATCATGTACA TTTGAAAATA 4860 

TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA ACTACTCGGA TGTGTCCTTT 4920 
CTGAACAAAA 



SEQ ID NO:232 PFD4 Protein sequence 

Protein Accession #: O43511 



1 11 21 31 41 51 

| | | | | | 

MAAPGGRSEP PQLPEYSCSY MVSRPVYSEL AFQQQHERRL QERKTLRESL AKCCSCSRKR 60 

AFGVLKTLVP ILEWLPKYRV KEWLLSDVIS GVSTGLVATL QGMAYALLAA VPVGYGLYSA 120 

FFPILTYFIF GTSRHISVGP FPVVSLMVGS VVLSMAPDEH FLVSSSNGTV LNTTMIDTAA 180 

RDTARVLIAS ALTLLVGIIQ LIFGGLQIGF IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240 

NVSTKNYNGV LSIIYTLVEI FQNIGDTNLA DFTAGLLTIV VCMAVKELND RFRHKIPVPI 300 

PIEVIVTIIA TAISYGANLE KNYNAGIVKS IPRGFLPPEL PPVSLFSEML AASFSIAVVA 360 

YAIAVSVGKV YATKYDYTID GNQEFIAFGI SNIFSGFFSC FVATTALSRT AVQESTGGKT 420 

QVAGIISAAI VMIAILALGK LLEPLQKSVL AAVVIANLKG MFMQLCDIPR LWRQNKIDAV 480 

IWVFTCIVSI ILGLDLGLLA GLIFGLLTVV LRVQFPSWNG LGSIPSTDIY KSTKNYKNIE 540 

EPQGVKILRF SSPIFYGNVD GFKKCIKSTV GFDAIRVYNK RLKALRKIQK LIKSGQLRAT 600 

KNGIISDAVS TNNAFEPDED IEDLEELDIP TKEIEIQVDW NSELPVKVNV PKVPIHSLVL 660 

DCGAISFLDV VGVRSLRVIV KEFQRIDVNV YFASLQDYVI EKLEQCGFFD DNIRKDTFFL 720 

TVHDAILYLQ NQVKSQEGQG SILETITLIQ DCKDTLELIE TELTEEELDV QDEAMRTLAS 780 



SEQ ID NO:233 PFH2 DNA SEQUENCE: 

Nucleic Acid Accession #: NM_016029 

Coding sequence: 228-1097 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60 

GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGOTG CTGTGCGCGC 120 

TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180 

TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240 

TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 

vrr c scam gctgtcagcc agaagagtgc atgagctgga aagggtgaaa agaagatgcc 360 

TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 

CTGGTTCCCA TGAAGCGGCT AOCAAAGCTO TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 

TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 

ACAGAAAGCT AATAGAGCTT AACTACTTAG GGAOGGTGTC CTTGACAAAA TGTGTTCTGC 600 

CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660 

TATCTGTACC TCTTTCCATT GGATACTQfTG CTAGCAAGCA TGCTCTOCGG GGTTTTTTTA 720 

ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTPTCTAAC ATTTGCCCAG 780 

GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 

GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 

TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 

CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 

AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 

AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG OCACTGGAGG GAGAAATGGA 1140 

AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200 

ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260 
AGATTGCCAT GAATCTTGCA AA 
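
The DNA listings are laid out as rows of six 10-base groups, each full row ending in a running position counter. The following is a rough sketch of how such rows can be joined back into a single string while checking the counters; the function name `parse_listing` and its `offset` parameter are illustrative assumptions, and the sketch presumes rows whose groups contain no stray tokens.

```python
def parse_listing(text, offset=0):
    """Join the 10-base groups of a sequence listing into one string.

    `offset` is the number of bases that precede the first row supplied.
    A trailing all-digit token on a row is treated as the running position
    counter and is checked against the number of bases read so far.
    """
    rows = []
    total = offset
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank separator line between rows
        counter = None
        if tokens[-1].isdigit():
            counter = int(tokens.pop())
        row = "".join(tokens).upper()
        total += len(row)
        if counter is not None and counter != total:
            raise ValueError(f"row ends at {total}, but the listing says {counter}")
        rows.append(row)
    return "".join(rows)


# The last two rows of the PFH2 DNA listing above (positions 1201 onward).
PFH2_TAIL = """
ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260

AGATTGCCAT GAATCTTGCA AA
"""

tail = parse_listing(PFH2_TAIL, offset=1200)
print(len(tail))
```

A counter mismatch raised by this check is a quick way to spot rows where extraction dropped or duplicated characters.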



SEQ ID NO:234 PFH2 Protein sequence 
Protein Accession #: NP_057113 



1 11 21 31 41 51 

| | | | | | 

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60 

GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120 

ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180 

KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240 

IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300 
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD 



SEQ ID NO:235 ACC5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000450 
Coding sequence: 1-1833 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60 

GCCTGGTCTT ACAACACCTC CACGGAAGCT ATOACTTATG ATGAGGCCAG TGCTTATTGT 120 

CAGCAAAGGT ACACACACCT GGTTGCAATT CAAAACAAAO AAGAGATTGA GTACCTAAAC 180 

TCCATATTGA GCTATTCACC AAGTTATTAC TGGATTGGAA TCAGAAAAGT CAACAATGTG 240 

TGGGTCTGGG TAGGAACCCA GAAA0CTCT6 ACAGAAGAAG CCAAGAACTG GGCTCCAGGT 300 

GAACCCAACA ATAGGCAAAA AGATGAGQAC TGCQTCGAGA TCTACATCAA GAGAGAAAAA 360 

GATGTGGGCA TGTGGAATGA TGAGAGGTGC AGCAAGAAQA AGCTTQCCCT ATGCTACACA 420 

GCTGCCTGTA CCAATACATC CTGCAGTGGC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480 

TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT 540 

ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGGTTTGCA CTCACCCACT GGGAAACTTC 600 

AGCTACAATT CTTCCTGCTC TATCAGCTGT GATAGGGGTT ACCTGCCAAG CAGCATGGAG 660 

ACCATGCAGT GTATGTCCTC TGGAGAATGG AGTCCTCCTA TTCCAGCCTG CAATGTGGTT 720 

GAGTCTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATCTTTCCA AAACCCTGGA 780 

AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAG AAGGATTTGA ACTAATGGGA 840 

GCCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTGCCACA ACGAGAAGCC AACGTCTAAA 900 

GCTGTGACAT GCAGGGCCGT CCGCCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960 

CCTGCTGGAG AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020 

TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACTCAAGGGC AGTGGACACA GCAAATCCCA 1080 

GTTTGTGAAG CTTTCCAGTG CACAGCCTTG TCCAACCCCG AGCGAGGCTA CATGAATTGT 1140 

CTTCCTAGTG CTTCTGGCAG TTTCCGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200 

GGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTGGGACAAC 1260 

GAGAAGCCCA CATGTGAAGC TGTGAGATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320 

GTGAGGTGTG CTCATTCCCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380 

TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440 

TGGACAGAAG AGGTTCCTTC CTGCCAAGTG GTAAAATGTT CAAGCCTGGC AGTTCCGGGA 1500 

AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGCCTGT 1560 

CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCGGACAT GTGGAGCCAC AGGACACTGG 1620 

TCTGGCCTGC TACCTACCTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680 

CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740 

TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800 
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA 



SEQ ID NO:236 ACC5 Protein sequence:
Protein Accession #: NP_000441



1 11 21 31 41 51 

I I I I I I 

MIASQFLSAL TLVLLIKESG AWSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60 

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK 120 

DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCEQIVNC 180 

TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV 240 

ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK 300 

AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP 360 

VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420 

EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ 480 

WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540 

SGLLPTCEAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600 
GSYQKPSYIL 
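
Each entry in this listing pairs a DNA sequence with a "Coding sequence: start-end" range and a protein translation. The sketch below shows one way to check a protein row against its DNA row. It assumes the range is 1-based and inclusive and that the standard genetic code applies. The function names (`clean_listing`, `extract_cds`, `translate`) are hypothetical helpers written for this illustration and are not part of the application.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def clean_listing(text):
    """Drop whitespace and position numbers from listing rows, keeping only letters."""
    return "".join(ch for ch in text.upper() if ch.isalpha())


def extract_cds(sequence, start, end):
    """Return the coding region, assuming 1-based inclusive coordinates."""
    return sequence[start - 1:end]


def translate(cds):
    """Translate codon by codon and stop at the first stop codon.

    Codons containing characters other than A, C, G, T are reported as 'X'.
    """
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row of the ACC5 DNA sequence as printed above.
first_row = "ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60"
print(translate(extract_cds(clean_listing(first_row), 1, 60)))
```

Applied to the first DNA row, the translation matches the first twenty residues of the ACC5 protein row (MIASQFLSAL TLVLLIKESG). A mismatch at a single position usually points to a character-recognition error in one of the two rows.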

SEQ ID NO:237 PU28 DNA SEQUENCE

Nucleic Acid Accession #: K51002

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACCC CAATGAGCCA AAGGGGGTCC 60 

CAAAGCAGTG GCTCGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120 

GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180 

CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240 

CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGATCCA 300 

CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360 

GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420 

TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480 

GCCCAGTCTC CCTCAGGAGT ATCCAGTGAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540 

TTTGAGCACC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600 

AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCGT 660 

GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720 

CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780 

ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840 

AACTATGAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTT C1T U CUG AGTGGGAGAG 900 

GTGGAACAGG AACCAGAGAC AGGAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960 

AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020 

ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080 
AATGATAAAC TAGAAAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAGAG 1140 

AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200 

AGAAAGGCTG AAACCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260 

ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAGAACGTA TGAGACATTT AGAGGGTCAA 1320 

CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380 

CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTGA CTGAATCCAA TGAACGCCTA 1440 

CAACTACACT TAAAGGAAAG AATGGCTGCT CTAGAAGAAA AGAATGTTTT AATTCAAGAA 1500 

TCAGAAACTT TCAGAAAGAA TCTTGAAGAA TCTTTACATG AXAAGGAAAG ATTAGCAGAA 1560 

GAAATTGAAA AGCTGAGATC TGAACTTCAC CAATTGAAAA TGAGAACTGG CTCTTTAATT 1620 

GAACCCACAA TACCAAGAAC TCATCTAGAC ACCTCAGCTG AQTTGCG3TA CTCAGTGGGA 1680 

TCCCTAGTGG ACAGCCAGTC TGATTACAGA ACAACTAAAG TAATAAGAAG ACCAAGGAGA 1740 

GGCCGCATGG GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800 

AATAGAACTC AACAGATTGG AGTACTAAGC AGCCACCCTT TTGAAAGTGA CACTGAAATG 1860 

TCTGATATTG ATGATGATGA CAGAGAAACA ATTTTTAGCT CAATGGATCT TCTCTCTCCA 1920 

AGTGGTCATT CCGATGCCCA GACGCTAGCC ATGATGCTTC AGGAACAATT GGATGCCATC 1980 

AACAAAGAAA TCAGGCTAAT TCAGGAAGAA AAAGAATCTA CAGAGTTGCG TGCTGAAGAA 2040 

ATTGAAAATA GAGTGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100 

GGTACCTCCA TTACTGCCTC TCTTACAGCT TCATCGCTGG CCAGTTCATC TCCCCCCAGT 2160 

GGACACTCAA CTCCAAAGCT CACCCCTCGA AGCCCTGCCA GGGAAATGGA TCGGATGGGA 2220 

GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAGT TGTGGAAGAA 2280 

GATGGTCGAG AGGACAAAGC AACAATTAAA TGTGAAACTT CICCTCCTCC TACCCCTAGA 2340 

GCCCTCAGAA TCACTCACAC TCTCCCTTCT TCCTACCACA ATGATGCTCG AAGTAGTTTA 2400 

TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460 

CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2520 

AAAGAAAAAG CTCGACTTGG GCAGCTCCGA GGCTTTATGG AGACTGAAGC TGCAGCTCAG 2580 

GAGTCCCTGG GGTTAGGCAA ACTCGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640 

AAGCATGAAC TTCTTGAAGA AGCTCGGAGA AAGGGATTAC CTTTTOCOCA GTGGGATGGG 2700 

CCAACTGTGG TCGCATGGCT AGAGCTTTGG TTGGGAATGC CTGCGTGGTA CGTGGCAGCC 2760 

TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820 

AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880 

ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACATCTC GAACTCCTTC AGGCAACGTT 2940 

TGGGTGACTC ATGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000 

GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGACCC TGGCTTATGG AGATATGAAT 3060 

CATCAGTGGA TTGGAAATGA ATGGCTTCCC AGCTTGGGGT TACCTCAGTA CAGAAGTTAC 3120 

TTTATGGAAT GCTTGGTAGA TGCAAGAATG TTAGATCACC TAACAAAAAA AGATCTCCGT 3180 

GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTATGTGC 3240 

TTAAAGAGGT TGAATTATGA CAGAAAAGAA CTAGAAAGAA GACGGGAAGC AAGCCAAGAT 3300 

GAAATAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360 

GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420 

CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480 

ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540 

AGGCGACTGG ATGAAAGTGA TGACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600 

TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660 

GCTGGATTTA GGTTAACCAC AACCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720 

TCATCAAGAC TCCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATG TCTCGAGTAA 3780 
GCGGCCGCTT TAA 



SEQ ID NO:238 P M28 Protein sequence
Protein Accession #: none found



1 11 21 31 41 51

I I I I I I

MMCEVMPTIN EDTPMSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60 

QRLQDVIYDR DSLQRQLNSA LPQDIESLTG GLAGSKGADP PEFAALTKEL NACREQLLEK 120 

EEEISELKAE RNNTRLLLEH LECLVSRHER SLRMTVVKRQ AQSPSGVSSE VEVLKALKSL 180 

FEHHKALDEK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKM ASSEGSTESE 240 

HLEGMEPGQK VHEKRLSNGS IDSTDETSQI VELQELLEKQ NYEMAQMKER LAALSSRVGE 300 

VEQEAETARK DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRYLSA QRESTSIHDM 360 

NDKLENELAN KEAILRQMEE KNRQLQERLE LAEQKLQQTM RKAETLPEVE AELAQRIAAL 420 

TKAEERHGNI EERMRHLEGQ LEEKNQELQR ARQREKMNEE HNKRLSDTVD RLLTESNERL 480 

QLHLKERMAA LEEKNVLIQE SETFRKNLEE SLHDKERLAE EIEKLRSELD QLKMRTGSLI 540 

EPTIPRTHLD TSAELRYSVG SLVDSQSDYR TTKVIRRPRR GRMGVRRDEP KVKSLGDHEW 600 

NRTQQIGVLS SHPFESDTEM SDIDDDDRET IFSSMDLLSP SGHSDAQTLA MMLQEQLDAI 660 

NKEIRLIQEE KESTELRAEE IENRVASVSL EGLNLARVHP GTSITASLTA SSLASSSPPS 720 

GHSTPKLTPR SPAREMDRMG VMTLPSDLRK HRRKIAVVEE DGREDKATIK CETSPPPTPR 780 

ALRMTHTLPS SYHNDARSSL SVSLEPESLG LGSANSSQDS LHKAPKKKGI KSSIGRLFGK 840 

KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRRLKK KHELLEEARR KGLPFAQWDG 900 

PTVVAWLELW LGMPAWYVAA CRANVKSGAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960 

MVSLTSPSAP PTSRTPSGNV WVTHEEMENL AAPAKTKESE EGSWAQCPVF LQTLAYGDMN 1020 

HEWIGNEWLP SLGLPQYRSY FMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIMC 1080 

LKRLNYDRKE LERRREASQD EIKDVLVWSN DRIIRWIQAI GLREYANNIL ESGVHGSLIA 1140 

LDENFDYSSL TLLLQIPTQN TQARQILERE YNNLLALGTE RRLDESDDKN FRRGSTWRRQ 1200 
FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKMTTDVA SSRLQRLDNS TVRTYSCLE 

SEQ ID NO:239 PCM DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)
1 11 21 31 41 51

I I I I I I 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60 

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120 

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGA8TGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420 

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720 

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG ACTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020 

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC UWOUTTTtf 1080 

GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:240 PCM Protein sequence

Protein Accession #: NP_057654

1 11 21 31 41 51

I I I I I I

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60 

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 

KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 

VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 
IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 
MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPP 360 

EDGHTDNHLP LLENNTH 

SEQ ID NO:241 PBA7 DNA SEQUENCE

Nucleic Acid Accession #: AA219134

Coding sequence: 24-1815 (underlined sequences correspond to start and stop codons)

AATTOGOOCT TCCTTAATTA AG CATGT TTA GCTTCCIGTC ATCTGTCACT OCTOCTOTCA 60 
GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCTCTTCTT CAGATCAAAA 120 
CCTTATTAGC CCTGAGCTGC CATGAGCAGG AAATGGTTGT GAGCTOOCIC GTCATTGGAG 180 

CCCTCCTTGC CTCACTCACC GGAGGGGTCC TGATAGACAG ATATGGAAGA AGGACAGCAA 240 

TCATCTTGTC ATCCTGCCTG CTTGGACTCG GAAGCTTAGT CTTGATCCTC AGTTTATCCT 300 

ACACGGTTCT TATAGTGGG A CGCATTG(XA TAGGGGTTTC CATCTOCCTC TCTTCCATTG 360 
OCACTTGTGT TTACATCGCA GAGATTGCTC CTCAACACAG AAGAGGCCTT CTTGTGTCAC 420 
TGAATGAGCT GATGATTGTC ATOGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTO 480 

OCAATGI 1 11 CCATGGCTGG AAGTACATGT TTGGTCTTGT GATTCCCTTG GGA GI 1 1 IC C 540 

AAGCAATTOC AATGTA1 ITT CI lUJIUCAAGCCCTCGGi I 1C1U01UATQ AAAGGACAAO 600 

AGGGAGCTGC TAGCAAGGTT CTTGGAAGGT TAAGAGCACT CTCAGATACA ACTGAGGAAC 660 

TCACTGTGAT CAAATCCTCC CTGAAAGATO AATATCAGTA CAGTTTTTGG GATCTGTTTC 720 
GTTCAAAAGA CAACATGOGG AOCCGAATAA TGATAGGACT AACACTAGTA TTTTTTCJTAC 780 

AAATCACTGG CCAAOCAAAC ATATTGTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 

TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT CCACTGGGGT TGGAGTCGTC AAGGTCATTA 900 

GCACCATCCC TGGCACTCTT CTTGTAGACC ATGTCGGCAG CAAAACATTC CICIGCATTG 960 

GCTCCTCTGT G ATGGCAGCT TCGTTGGTG A CCATGGGCAT CGTAAATCTC AACATCCACA 1020 
TGAACTTCAC CCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080 

TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAG AGAC CACTTCAAAG 1140 

GGATTTCTTC CCATAGCAGA AGCTCACTCA TGCCCCTGAG AAATGATGTG GATAAGAGAG 1200 

GGGAGACG AC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TAOCAG ATAG 1260 

TCACAGACOC TGGGGACGTC CCAG C1 1 1 IT TOAAATGGCT QTCCTTAGCC AGCTTGCTTO 1320 
TTTATGTTGC TGCTTTTTCA ATTGGTCTAG GACCAATGCC CTGGCTGGTG CTCAGCGAGA 1380 

TCrnOCTOG TGGGATCAGA GGACGAGCCA TGGCTTTAA C TTCTAOCATG AACTOOGGCA 1440 

TCAATCTCCT CATCTCGCTG ACATTTTTGA CTGTAACTGA TCTTATTGGC CTGOCATGGG 1500 

1GTGC1 1 1 A T ATATACAATC ATGAGTCTAG ATCTTATTGG CCTGCCATGG GTGTGCTTTA 1560 
TATATACAAT CATGAGTCTA GCATCCCTGC TTTTTGTTGT TATGTTTATA CCTGAGACAA 1620 
AGGGATGCTC TTTGGAACAA ATATCAATGG AGCTAGCAAA AGTGAACTAT GTGAAAAACA 1680 

ACATTTGTTT TATGAGTCAT CACCAAG AAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740 

AACCCCAGG A GCAGCTCTTG GAGTGTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 

TTTCTCCAGA GACdAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA 1860 
GGAGGG1U1C 1T1UGACCAA TGCATAGTTQ CGACTCCTGT G C1UC1 1 1 1 CAGTGTCATO 1920 
GAACTGGTTT TGAAGAGACA CICTGAAATG ATAAAGACAG CCTTTAATCC CCCTCCTCMC 1980 

CAGAAGGAAC CTCAAAAGGT AGATGAGGTA CAAGGTCCTA AOTGATCTCT TTTTC T G AGC 2040 

AGG ATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGTTTA ATACTTTCTA CCTTCTTCAC 2100 

AGAGCAGCCT TTGAATAGAC TATGTCCTAG TGAAGACATC AACCTCCGCC TTAAGCTATG 2160 
TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220 
GTACAGTTTC TGCCTACCAA GACACTACTT GCACTGGATC TTACGCAAAA AAGAACCAGA 2280 
ACACACAGTG TGGACAACTG CCCATATATT CTATCTAGAT TAGGAQAOGG TCCTCGCTAG 2340 
GATTTTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400 
TGAACTATAA ACTATAATTT AATGCAAAAT ATOCTTTTAT GAATTTCATG TTAATATTGT 2460 
GAAATATTAA AATAATTOCR CAATAGTTGA GAAAAATGAG C AU11UIC CATTTTTAAA 2520 
AAATGCATAG AAAAGACAAT TTTAAAATOC TGGG ACCATA TTTATTTAGA AGTAGCTGTT 2580 
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTAAGCAATT 2640 
AGGTTQAAGT TATTAAGTCA AGCCTAG AAA AGCTGCCTCC TTGTAAGGCT TTCATGACAA 2700 
TGTATAGTAA TCCACAGTCT (XAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760 
TACGGTACAC AGGCTATAAT TG ATGATG AT GTTCAG ATAA CTG AA G ACAC AATAAATGAC 2820 
ATTCAGACAT CAGGAMAAWW CCCTCATGTT CTTTTCTATG ATGGCCACCT GTACCAGCAA 2880 
CGTGGGTTTC ACCCACACAA CGATGAACTG TTCTCTTACT TCTOCAGTTG ATTTTAAAGA 2940 
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATG ATAG AAAAWCCACA ATCAAAWCTT 3000 
GAAOCAAAT A ACAT ATTAAA TTACTAATAT TTAAGTG ATG G AAG ACACAC AAAAAACTTA 3060 
AAAGCACGAA CAACCTAACT TGAAAAAGAA TTTTAAAATA TGATTAACCT GAAGAAAAGA 3120 
G AATCCTAAG AGGCAAAGCT CCTTTTTATT TAGCTTGG AA TTTTCCTATT GGTTCCTAAC 3180 
AAACTGTCCC AATGTCATAT AAGGAAACAT GATCTATTAC ATTCCTTTAT AACAATGTGG 3240 
AGAGACTATA AACCTATGTA AGT AGTAAAA CTATATYAGA G ACTCAGG AG ACTGACTAAA 3300 
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360 
GGAGATTGG A GATCTCAATT CTATCATGGT GTATTTCATA OGCAAATCAG AGCATGCATT 3420 
G i l 1 11 A 1 1UGAAAGA GAAGGGAAGT GTG i 1C1GCC CCATGTTTCC TTCOGTGTTT 3480 
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TTTTTTGTTT AGCCXTTCAT TATAAATGGG 3540 
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG ACCTTGACTT CAGTATCTTG 3600 
AGCATTCrTT TATATTnTC TTTTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660 
AAATTTTTAT CCAATATCCA TTGCACCACA OCAAATCAAG CTTCTTGATT TTCAAAAATA 3720 
AAAAGGGGG A AATACTTACA ACTTGTACAT ATATATTCAC AG 11 1 1 1 ATT TATAAAAAAA 3780 
ATTTACAGTA CTTATGGAGA GCCAGCAGAA GACATCAGAG CACTCACTTC TTOCCATCTT 3840 
TGTTAAGGTT AGOGAATTAC GCATGGACAC TGTTAGGTGA GGCTCATTCG GCAGOCCTGA 3900 
AAACAAACCT GGTCACACTG TCTTTAOOCT CTOCCTTCAG ATAAAGCACT TCG ATTATCT 3960 
ATTG ATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTGGATCTG 4020 
TAOCTTGGCT ATATAAGGAT GTTTTCOOCC TATTCTATGT TTCTTTTTTT GGTGAACATT 4080 
GAAAAACAGO AGGTGACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAA GTC TTT 4140 
AAAACAGTG A GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATCCTA 4200 
CTGAATCTTA AAATAAAGG A AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260 
MWAAAAATCT CAATG AAATA TTTCACAAGA AGGAAAAA 



SEQ ID NO:242 PBA7 Protein sequence
Protein Accession #: AAF91431

MFTFLSSVTA AVSGLLVGYE LGH5GALLQ DCTLLALSCH EQEMWSSLV IGALLASLTG 60 
GVUDRYGRR TAHLSSCLL GLGSLVULS LSYTVIJVGR IAIGVSBLS SIATCVY1AB 120 
IAFQHRRGLL VSLNELMIVI GILSAYISNY AFANVFHGWK YMFGLVIPLG VLQAIAMYFL 180 
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVKSSL KDEYQYSFWD LFRSKDNMRT 240 
RIMIGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGWK VISTIPATLL 300 
VDHVGSKTFL OGSSVMAAS LVTMGTVNLN IHMNFIHICR SHNSINQSLD ESVTYGPGNL 360 
STNNNTLRDH FKGISSHSRS SLMPLRNDVD KRGBTTSASL LKAGLSHTEY QIVTDPGDVP 420 
AFLKWLSLAS LLVYVAAFSI GLGPMPWLVL SEIFPGGIRG RAMALTSSMN WG1NLLKLT 480 
FLTVTDLIGL FWVCFJYTIM SLDUGLPWV CFTYTIMSLA SLLFVVMFIP ETKGCSLEQI 540 
SMELAKVNYV KNNK3MSHH QEELVPKQPQ KRKPQEQLLE CNKLCGRGQS RQLSPET 

SEQ ID NO:243 PAP4 DNA SEQUENCE

Nucleic Acid Accession #: AA172056

Coding sequence: 121-339 (underlined sequences correspond to start and stop codons)



TTTAGCCACC AGAGGANTTC TCTTG AAATA CCCAAAATCC ATCAGTATCT TG AATCATGC 60 
TGGATTTTG A AGAATTCTTA AGAAGOCATG TAAAGGGGGC TCTCTGGCCT TG AAATAGTG 120 
ATGTTTTTTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180 
GATTTCAAGA AATTACAGGA AAACTTTCCA AAGTTCCATC TCACAGAANN TTATTTTNCC 240 
AAGAATTCCA AGATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300 
CATCTCAGAC ATCGACAGAT GATTACATCA CTTAIAQTTC TAGTAAATTT ATTAATATAA 360 
AACTCAG AGA CATTOCAATA TCCACATTGC TTACAOCATT AGGCATAG AT TCAGTGTCAG 420 
CTATGACAAT TG AAAATGAG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAAOCAAAC 480 
TGCTTGATCC AG ATGCAGGA CTGCAAATGT TAATATTTGT TCTGG AAGAA CAATCAAATA 540 
AGACTTAAGA GGAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTGTGC 600 
AGOCTACTAA ATCAGAATGA AAATAGAAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660 
TTTCTTAAGC TTAOCTAAAG TTATTTCATC TGAAAATTTC AAGCAACTTT GTTCAACATT 720 
AAATTG ACAA TCTAAACTAA CAAGTCTTTT GA ATTTAT GC ATGGTAGTAA ACATTCTCTC 780 
TATTAACTTT ATTAOCTAAG GCTAAAOCTA AAATTTTTAA GCAAAATTAG AAAAATAGTC 840 
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCGTT 900 
CrTGGTTTTT TATTTGGAG A GTCTGTGCAA AATGTCACTA AAAATAAATT AGCACT AGAA 960 
ATTATTTCTA AAT ACCAAA 

SEQ ID NO:244 PBQ8 DNA SEQUENCE

Nucleic Acid Accession #: X51405

Coding sequence: 3-1721 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51



A AATO GCGTG 
CCTGGGCTCC 
GTGGCOCCAG 
GGTGCUUAAC 
AAGAGGCCGC 
GACGGGGCAG 
GCQGOGAAGC 
AGCAAGAGGA 
TOTCC3GTGTG 
AGGGCCGGGA 
AGCCTGAATT 
TCATTTTCTT 
ACCTGATCCA 
AGGCAGCGTC 
GAATAGATCT 
AAGG TC GTCC 
AGCTTGCTCC 
CTGCCAATCT 
GTAGTGCTCA 
CATACTCTTC 
ATCATGACAG 
GAGGGATGCA 
GCTGTGAGAA 
CCCTCATTAG 
AAG6TAACCC 
CCGCAAAGGA 
CAGCTCCAGG 
GGGTTGATTT 
TCGAATGGTG 
CTTTAAATCT 
CAGTTAATAC 
AAAIA AATAG 
TATTCATTTT 
ATCCTAGGCT 
TCTAGCTTTC 
AATGCTATTG 
TAAATAGTTC 
TGTTAATGCA 
AATAAAAATT 
TTAACACTAC 
CTGAATGAAT 



C CCGTCflCTC 
GOGGCCAGTA 
TGCGCGGGCT 
TTGCCGCCCC 
CCGCGTAGGA 
OGOGCTGCTG 
CCAGGAGCCC 
CGGCATCTCC 
GCTGCAGTGC 
GCTCCTGGTC 
TAAATACATT 
GGCCCAGTAC 
CAGTACCCGC 
TCAGCCTGGT 
GAACCGGAAC 
AAATAATCAT 
TGAGACCAAG 
CCATGGAGGA 
CGAATACAGC 
TTTCAACCCG 
CAGCTTTGTA 
AGACTTCAAT 
GTTCCCACCT 
CTAOCTTGAG 
AATTGCGAAT 
TGGTGATTAC 
CTATCTGGCA 
TGAACTGGAG 
GAAAATGATG 
ATCTATATAA 
TTAACATTGA 



CCTACCTATA 
TAAATGCAAT 
AAAAATTAGT 
AAAAGGTTAA 
AGTATAAATT 
TnTTGATGG 
GACTTCTTGC 
TTAAAAGTTT 
AAAGGTTAAA 



CGOOGGOCCC 
GTGCAGCCCG 
GACACTCATT 
CAGCAGCGCC 
AGGGACGGCC 
GCTCTGTGOG 
GGGGGGCCOG 
TTCGAGTACC 
ACCGCCATCA 
ATOGAGCTGT 
GGGAATATGC 
CTATGCAACG 
ATTCACATCA 
GAACTCAAGG 
TTTCCAGACC 
CTGTTGAAAA 
GCTGTCATTC 
GACCTTGTGG 
TOCTCCCCAG 
GCCATGTCTG 
GATGGAACCA 
TACCTTAGCA 
GAAGAGACTC 
CAGATACACC 
GCCACCATCT 
TGGAGATTGC 
AT AACAAA GA 
TCATTTrCTG 
TCAGAAACTT 
TGTAGTATGA 
TTTATTTTTT 
AAAAATATAA 
TTACACAAAA 
ATTCCTGGTA 
GAAGTTCTTT 
CAGATACAGC 
GTCGTTTTTT 
GAAGAAAAGG 
TTGTACATAT 
AGGGTTTTCT 
AAAAAATCCC 



CTGCCTCGCA 
TGGAGOCGCG 
CAGCCGGGGA 
GGCGGGCTAA 
GGCGGCGGCG 
GGGCACTGGC 
CGGCGGGCAT 
ACCGCTAOCC 
GCAGGATTTA 
CCGACAACCC 
ATGGGAATGA 
AATACCAGAA 
TGCCTTCCCT 
ACTGGTTTGT 
TGGATAGGAT 
ATATGAAGAA 
ATTGGATTAT 
CCAATTATCC 
ATGACGCCAT 
ACCCCAATCG 
CCAACGGTGG 
GCAACTGTTT 
TGAAGACCTA 
GAGGAGTTAA 
CCGTGGAAGG 
TTATACCTGG 
AAGTGGCAGT 
AAAG GAAAGA 
TAAATTTTTA 
TGTAATGTGG 
AATCATTTAA 
GAACTTGATA 
AAGTATAGAA 
TTATTTACAA 
TACTGTAATT 
TCGGAGTTGT 
TCT TGTGCTO 
TACATGTTTA 
AfiGAGCAATA 
CTT QQ TT U T A 
CAGTGAAAAA 



I 

GTGGTTTCTC 
QCTTTBO00S 
AGGTGAGGCG 
GCCCAGGGCC 
GAGCGCAGCG 
TGCCTGCGGG 
GAGGCGGCGC 
CGAGCTGCGC 
CACGGTGGGG 
TGGCGTCCAT 
GGCTGTTGGA 
GGGGAACGAG 
GAACCCAGAT 
GGGTCGAAGC 
AGTGTACGTG 
AATTGTGGAT 
GGATATTCCT 
ATATGATGAG 
TTTCCAAAGC 
GCCACCATGT 
TGCTTGGTAC 
TGAGATCACC 
CTGGGAGGAT 
AGGATTTGTC 
AATAGACCAC 
AAACTATAAA 
TOCTTACAGC 
AGAGGAGAAG 
AAAAGGCTTC 
TCTTTTTTTT 
ATATTAATCA 
TATTTCATTC 
AAGATTTAAQ 
TGCAGAATTT 
GGTGACAATG 
GAGCACTCTA 



CAAAGAGGTT 
CTATTATATT 
GAGTGGCCCA 
AAA 



CTGCAGCTOC 
TCTCCTCTGG 
AGTAGAGGCT 
GGGCAGACAA 
ATGGCCGGGC 
TGGCTCCTGG 
CGGCGGCTGC 
GAGGCGCTCG 
CGCAGCTTCG 
GAGCCTGGTG 
CGAGAACTGC 
ACAA3TGTCA 
GGCTTTGAGA 
AATGCCCAGG 
AATGAGAAAG 
CAAAACACAA 
TTTGTGCTTT 
ACGCGGAGTO 
TTGGCCCGGG 
CGCAAGAATG 
AGCGTACCTG 
GTGGAGCTTA 
AACAAAAACT 
CGAGACCJTC 
GATGTTACAT 
CTTACAGCCT 
CCTGCTGCTG 
GAAGAATTGA 
TAGTTAGCTG 
AGATTTTGTG 
AdTTCCTEA 
TCTTATATAG 
TAATTTTGCC 
TTTGAGTAAT 
TCACATAATO 
CTGCAAGACT 
AGCATGATCT 
TTATGAAAAG 
ATGTAGTOCG 
GAATTGCATT 



60 
120 
160 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 



SEQ ID NO:245 PBQ8 Protein sequence
Protein Accession #: P16870



MAGRGGSALL ALCGALAACG WLLGAEAQEP GAPAAGMRRR RRLQQEDGIS FEYHRYPELR 60 
EALVSVWLQC TAJSRIYTVG RSFEGRELLV ELSDNPOVH EPGEPEFKYI GNMHGNEAVG 120 
RELLIFLAQY LCNEYQKGNE TTVNUHSTR IHIMPSLNPD GFBKAASQPG ELKDWFVGRS 180 
NAQGDDLNRN EPDLDRIVYV NEKEGGPNNH LLKNMKKIVD QNTKLAPETK AVIHWIMDIP 240 
FVLSANLHGG DLVANYPYDE 1RSGSAHEYS SSPDDAIFQS LARAYSSFNP AMSDPNRPPC 300 
RKNDDDSSFV DGTINGOAWY SVPGGMQDFN YLSSNCFETT VELSCEKPPP EBTLKTYWBP 360 
NKNSLISYLE QERGVKGFV RDLQGNPIAN ATBVEGIDH DVTSAKDGDY WRLLIPGNYK 420 
LTASAPGYLA ITKKVAVFYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SHTLNF 



SEQ ID NO:246 PBY4 DNA sequence
Nucleic Acid Accession #: AF038B68

Coding sequence: 91-1107 (underlined sequence corresponds to start and stop codon)



l 
I 

GGGGCGACGT 
GTCGGGTGGG 
GACOCGGAXC 
CCACCAGGAC 
GTGAAGATGC 



11 
I 

GAGCGCGCAG 
TGACGCCGAG 
TCAACAATCC 
TTGATGAATA 
CTAATGTACC 



CAAGAAGAAC 
CTCAGTCAAC 
OCT TO CT T C T 
CTTATOTACT 
TTGSCT TOOT 

AGGAGTGACA 
GTACATGTAC 
CTTACTGGTC 
TTCACAGCAT 
ACAACAGGTG 
AAAACTGTOC 



TAGAAAGAAA 
ATGGTAGAAA 
ATCAGGAATT 
ACTTGTGGAT 
TTTGTGTTGA 
TTACTCCTTG 
GTTCATOT3M3 
TCCAAGCTGC 
TCAACCAAAA 
CAGCAGTCAT 



AGACOGCAGC 



21 
I 

GGQGGCGGCG 
AGCCAGAGAG 
CTTCAAGGAT 
TAATCCATTC 
CAATACACAA 
AAAGGAACAT 
AGCCGCAGAA 
AAATATTrGG 
TTCTQTAGAC 
GTTCCATGCA 
TTCTGCAAGA 
TTCATTTGTC 
ATTCTTTGTA 
AGGATTTCAT 
TATTCCTGTT 
CTCACTAGTT 
GAAGGCCCAA 
tPGCAAATGCA 



31 
I 

GCCTCGCCTC 
ATQ TCGQATT 
CCATCAGTTA 
TCGGATTCTA 
CCAGCAATAA 
GCATTGGCCC 
TTAGATCGTC 
CCAOCTCTTC 
ATTCCT OTAG 
GTAACACTGT 



41 
I 

GTCTCTCTCT 
TCGACAGTAA 
CACAAGTGAC 
GAACACCTCC 
TGAAACCAAC 
AAGCTGAACT 
GGGAACGAGA 
CTAGCAATTT 
AATTCCAAAA 
TTCTAAATAT 



51 
I 

CTGCGCCTGG 
CCCGTTTGCC 
AAGAAATGTT 
AOCAGGCGGT 
AGAGGAACAT 



TGTTGGTACA 
TTCTTCTTCG 
AACTGGGGCA 
GGAATCATGA 
ATCTT CAAAA 
CAGGAGTTTG 
GCTTCAACTG 



GACCACTTTA 
TCTATATTTG 
ATTGTGGTTG 
TGATAATCAT 
AAQTAC ATGG 
CAACAGGTGT 
CAGCATCTAG 



AATGCAAAAC 
TCCTGTCGGA 
GACAGTAAAG 
CTTCGGATGC 
TATCCTGTGG 
TGOAGCTTTC 
TCAGTTTGCT 
GATTTCATCC 
AGCAGCACTT 
ACTATATCGC 
GATGTCCAAC 
TGCAGCTCAG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
AATGCTTTCA AGGGTAACCA GATTTAAQAA TCTTCAAACA ATACACTGTT ACCTTTTGAC 1140 
TGTACCTTTT TCTCCAOTTA CTGTATTCTA CAAATATTTT TATGTTCAAA ACACACAGTA 1200 
CASACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260 
GTCTTATTAC TTTACCTAAT AGTTTCTTAA TATTTCAGTG CCCCTTGCAG AAAAAATATT 1320 
ACATGCTAAA TAAATATTCT CCATATTTTT GGGGGATGAC ATTCAGTGAA TTATTTCAGT 1380 
GGTGACCCAC TGAAAATTAA TAATGGTACT TATGATTAAA AACGCATTTA ATACTAACTG 1440 
CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCAC ATTCGAAAAG TAAACCATGT 1500 
TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATAGGCC AGCAGAGACT TAGGGATTTT 1560 
AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCAGTGAAG AGCCTATGTG CATTTTGTAG 1620 
TAGATAATGT AAAATOTGTC ATCTTTTTCT IT T ClTl'm ' TTAGAATAGC TGATATTTTG 1680 
ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740 
GCTTCTGTAC TGCTTATGGT TGTAGGATTC AGGGGTTAAT GGAATCACAG AAATGATATT 1800 
CTGCAAGAAT TTCTTTTAAA TAAAAAGTTT GGGGGTGCAA TATAAGAAGT TTATATAATA 1860 
TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAGTAGTG GTAATAATTC 1920 
CTTTOT 

SEQ ID NO:247 PBY4 Protein sequence

Protein Accession #: 

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV PPGLDEYNPF SDSRTPPPGG VKMPNVPNTQ 60 
PAIMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120 
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNIFGC LAWFCVDSAR 180 
AVDFGLSILW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240 
NWGNCGWISS LTGLNQNIPV GIMMHIAAL FTASAVESLV MFKKVHGLYR TTGASFEKAQ 300 
QHFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI 



SEQ ID NO:248 PBH2 DNA sequence
Nucleic Acid Accession #: none found

Coding sequence: 1-613 (underlined sequence corresponds to start and stop codon)



ATG AGAGACA ATAAATCGTG TO CTITmC ATGGGAAAGT TAAAT U1 1 IG 11 IT G AAGGC 60 
ACAGTAATAG CAGGCTATTC AGTGTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120 
AGTGCACT AC AATTTOCTAA AAAGTCTTCT CACCCTCACA GG ACTGCTCT ACATCTGGCC 180 
TCTGCCAATG GAAATTCAGA AGTAGTAAAA CTCCTGCTGG ACAGACGATG TCAACTTAAT 240 
ATCCTTGACA ACAAAAAGAG GACAGCTCTXj ACAAAGGCCG TACAATGCCA GGAAGATGAA 300 
TGTGCGTTAA TGTTGCTGGA ACATGGCACT GATCCGAATA TTCCAGATGA GTATGGAAAT 360 
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGOCAAAGC ACTGCTCTTA 420 
TACGGTGCTG ATATCGAATC AAAAAACAAG CATGGCCTCA CACCACTCTT ACTTGGTGTA 480 
CATGAGCAAA AACAGCAAGT GGTGAAATTT TT AATCAAG A AAAAAGCAAA TTT AAATGCA 540 
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600 
ATATATGAAA AGTAG 



SEQ ID NO:249 PBH2 Protein sequence:
Protein Accession #: none found

MRDNKSCAFF MGKLNVCFEG TVIAGYSVFA TTCUHLAVA SALQFFKKSS HPHRTALHLA 60 
SANGNSEWK ULLDRRCQLN ILDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120 
TALHYAIYNE DKLMAKALLL YGADIESKNK HGLTPLLLGV HEQKQQWKF UKKKANLNA 180 
LDRYGRCVTL GTLFTTKYW IYEK 



SEQ ID NO:250 PPJ1 DNA sequence
Nucleic Acid Accession #: XM_005829

Coding sequence: 1-3043 (underlined sequence corresponds to start and stop codon) 

ATGGTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60 
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120 
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180 
TTATTATTAA ACAATGGCAG CAGCGCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240 
CCCAGAGGTC TCCCCCATCG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300 
ATCAGGAGCA GATTTGAAGA ATTACAAAGT G AATTGGTGC CAGTCAGCAT GTCAGAG ACA 360 
GACCACATAG CCTCTACTTC CTCTGATAAA AATGTTGGG A AAACACCTGA ATTAAAGG AA 420 
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAAAATGA GTCCAAACTA 480 
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG AGCATAATAA TCGAATTGAA 540 
GCCCAGGAAA ATTATATTCC AGATCATGGT GGAGGTGAGG ATTCTTGTGC CAAAACAGAC 600 
ACAGGCTCAG AAAATTCTGA ACAAATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660 
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720 
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780 
TGCACCAAGA AATTTATTTC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840 
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900 
AATAAGGGAG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAG 960 
CAGG AACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020 
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080 
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAATAAAGAA 1140 
GTTAAAGCTT CCAG AG ATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200 




CAGTTACACA AAG AGATGGC CCAACGGATG GAACAGGOCA ACAAGAAATQ TGAAG AGGCA 1260 
CGOCAAGAAA AAG AAGCAAT GGTAATGAAA TATGTAAGAG GTGAG AAGGA ATCTTTAGAT 1320 
CTTCGAAAGG AAAAAGAGAC ACTTGAG AAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380 

AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG GACGGTTCCA CCAGCTGTAT 1440 
GAAACTAAGG AAGGOGAAAC GACTAGACTC ATCAGAGAAA TAGACAAATT AAAGGAAGAC 1500 

ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560 
TCACACAAGG AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620 
G AAGAAGCAG ATCAGATACG AAAAAACTGT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680 

GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTCGAA 1740 

AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800 
GAAGATCTG A AG AG AACATT TAAGGAGGGT ATGGATG AGT TAAG AACACT G AG AACAAAG 1860 
GTGAAATGTC TAGAAGATGA AOGATTAAGA ACAGAAGATQ AATTATCAAA ATATAAGG AA 1920 
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTGAA AACTGCAGAT 1980 
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTO AAAATTTGAA AGAAGAAGTG 2040 

GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TCGAAGGCAG TAGGAAAAGA 2100 
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AGAATGCACA GCTTCAGTCT 2160 
GAATOCAATT CTTTGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220 
AGCCAGTOTO AACAAATOAA ACAGACAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280 
GAAGAACTGC GAAAAGAGGA AGTCCAAACT CTGCAAGCTG AACTOGCTIX3 TAGACAAACA 2340 

GAAGTTAAAG CATTGAGTAC OCAGGTAGAA GAATTAAAAG ATGAGTTAGT AACTCAGAGA 2400 
CGTAAACATG CCTCTAGTAT CAAGG ATCTC ACCAAACAAC TTCAGCAAGC ACGAAGAAAA 2460 
TTAGATCAGG TTGAGAGTGG AAGCTATGAC AAAGAAGTCA GCAGCATGGG AAGTCGTTCT 2520 
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATCGATCTCC AGAAAATACT 2580 

GGGTCCTCAG TAGCTGTGGA TAACTTTCCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640 
ATAGTTAGGC TGCAAAAAGC ACATGCOOGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700 

CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760 
TTAOGAGAAG AATCAGGCAC ACTTTCTTCA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820 
AGTAGAOGGG GTGGCATCAT GGCATCTTTA TATACATCCC ATOCAGCTGA CAATGGATTA 2880 
ACATTQGAGC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGOA GGATACOTTA 2940 

CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCIT 3000 
ATTAAACACC AGCATGAACT AGAACAGAGG ACAAAGAAAA CQEAAAACAA GOCTCTTGCT 3060 
CAGTAAAG AG ACAAAAGOCA CACAGGAGTA GGTGGCACTG ACCTCTATTG TTGGAGACTT 3120 
TGTTCXACTT TTTGTTTCAG CCAGTAAAAA TATTGTTTTG CTTCATCTGT ACACAAAAAA 3180 
ATACCCTTTT ACAATATG AA TGCATTGCTG TATATACTGT AAGACTGAAA GCTTTGATGA 3240 

AATTTCTTTT TGTATGGTGC AATATGACAG OCTGTCATTG AATCTAAACA ACTTAATTTO 3300 
CTTGTATTCA TAAGAAGTGT TGAACATTAC AAGGGCTTTT AT 



Protein Accession #: NP_060487



MVHYLSFCN YYMEFYREEL PHIDYLIDIQ FATGKVTQPG EDTSYHQCAQ LEARDEGTDS 60 
LLLNNGSS AT LKTRTKCYGT PRGLPHRSLL QPTPFTCKTK KSRFEELQS ELVPVSMSET 120 
DHIASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QPNHHNNRIE 180 
AQENYIPDHG GGEDSCAKTD TGSENSEQIA NFPSGNFAKH BKTNETEQK VTQILVELRS 240 
STFPESANEK TYSESPYOTD CTKKFISKHC SVS ASEDLLB EIESELLSTB FAEHRVPNGM 300 
NKGEHALVLF EKCVQDKYLQ QEHHKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360 
KQHMNTIKQL ESRIEELNKE VKASRDQUA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420 
RQEKEAMVMK YVRGEKESLD LRKEKETLEK KLRDANKELE KNTNKIKQLS QEKGRLHQLY 480
ETKEGETTRL IREIDKLKED XNSHVUCVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540
EEADQIRKNC QDMDCTYQES EEKSNELD A KLR VTKGELE KQMQEKSDQL EMHHAKIKEL 600 
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE HNRQKAEIQ NLLDKVKTAD 660 
QLQEQLQRGK QEIENLKEEV ESLNSUNDL QKDDEGSRKR ESEtXLFTER LTSKNAQLQS 720 
ESNSLQSQFD KVSCSESQLQ SQCEQMKQTN INLBSRLLKE EELRKEBVQT LQAELACRQT 780 
EVKALSTQVE ELKDELVTQR RKHASSIKDL TKQLQQARRK LDQVESGSYD KEVSSMGSRS 840
SSSGSLNARS S AEDRSPENT GSS V AVDNFP QVDKAM1JER IVRLQKAHAR KNEKIEFMED 900 
HIKQLVEEIR KKTKDQSYI LREESGTLSS EASDFNKVHL SRRGGIMASL YTSHPADNGL 960 
TLELSLHNR KLQAVLEDTL LKNITLKENL QTLGTEIERL IKHQHELEQR TKKT 






Nucleic Acid Accession #: D83760

Coding sequence: 56-1459 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

TTGCOGTGAA GGGCTGTGCG GTTOCCGTGC GCGO06GAGC CTGCTGTGGC CTCTTATGCA 60 

CTOCACCACC CCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTCCT 120 

AGGCTGGAAG CAAGGAGATG AAGAGGAAAA GTGGGCAGAG AAGGCAGTGG ACTCTCTAGT 180 

GAAGAAGTTA AAGAAGAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240 

GGGGCACCOC AGCAAATGCG TCAOGATTCC CCGCTCCCTG GAOGGGCGGC TGCAGGTGTC 300 

CCACCGCAAG GGCCTGCCCC ATGTGATTTA CTGTCGOGTG T GG CGCTGGC CGGATCTGCA 360 

GTCCCACCAC GAGCTGAAGC C6CTGGA0TG CTGTGAGTTC CCATTTCGCT CCAAGCAGAA 420 

AGAAGTGTGC ATTAAOCCTT ACCACTACCG CCGGGTGGAG ACTCCAGTAC TGCCTCCTGT 480 

GCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAQ 540 

C GO CTCCCTQ CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTQ ACTCTTTCCA 600 

GCAGCCTCCQ TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCCC AGTCCCCGTG 660 

CACGGCCAGC TACCCTCACT CCCCAGGAAG T0CTTCTGA6 OCAGAGAGTC CCTATCAACA 720 

CTCAGTPGAC ACACCACCOC TGCCTTATCA TGCCACAGAA GCCTCTGAGA CCCAGAGTSO 780 

CCAACCTGTA GATGCCACAG CTGATAGACA TGTA3TGCTA TCQATACCAA ATGGAGACTT 840

TCGACCAGTT TGTTACGAGG AGCCCCAGCA CTCGTGCTCO GTCGCCTACT ATGAACTGAA 900 

CAACCGAGTT GGGGAGACAT TCCAG6CTTC CTCCCGAAGT GTGCTCATAG ATCGGTTCAC 960 

COACCCTTCA AATAACAGGA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020 

CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGOT GTGCACTTGT ACTACGTCGG 1080 

GGGAGAGGTG TATGCCGAGT GOGTGAGTGA CAGCAGCATC TTTGTGCAGA GCCGGAACTG 1140 

CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAG ATCCCCAGCG GCTGCAGCCT 1200 

CAAGGTCTTC AACAAOCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260 

TGAAGTCX3TG TATGAACTGA OCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320 

GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380 

TCATGGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGGCTCTC CACATAACCC 1440
CATTTCTTCA GTGTCTTAAC AGTCATGTCT TAAGCTGCAT TTCCATAGGA T



SEQ ID NO: 253 PBJ6 Protein sequence
Protein Accession #: NP_005B98

MHSTTPBSL FSFTSPAVKR LLGWKQGDEE BKWAEKAVDS LVKKLKKKKG AMDELERALS 60 
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPL ECCEFPFGSK 120 
QKEVONPYH YRRVETPVLP PVLVPRHSEY NPQLSLLAKF RSASLHSEPL MPHNATVPDS 180 
FQQPPCSAUP PSPSHAFSQS PCTAS YPHSP GSPSEPESPY QHSVDTPPLP YHATEASETQ 240 
SGQPVDATAD RHWLSIPNO DFRPVCYEEP QHWCS VAYYB LNNRVGETFQ ASSRSVUDG 300 
FTDPSNNRNR FCXGLLSNVN RNSTTENTRR HIGKGVHLYY VGQBVYAECV SDSSIFVQSR 360 
NCNYQHGFHP ATVCKIPSGC SLKVFNNQLF AQLLAQS VHH GFEWYELTK MCITRMSFVK 420 
GWGAEYHRQD VTSTPCWIH HLHGPLQWLD KVLTQMGSPH NPISSVS 



SEQ ID NO: 254 PBJ8 DNA sequence
Nucleic Acid Accession #: AB04684

Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTOCTG CCTTTTTCTT TTGCTTAAGG 60 

GATGGACAAG GAGCTGAGAT TTATGACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120 

TGGGGACACT TGGTTGATGC AGTCTCTCTC 'ICICTTTCIX: GGTGTTTATA ACAAAACAAA 180 

ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240 

GCATGCTTTT TTTCTCTTGC AGGGTATGTT C ls y iWlVfU CTTTTTCTTT TAGAAGCTAC 300 

TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACT GG GGC 360 

TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTO TGACAGTAAC TGAACATTGG 420 

TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATO 480 

GGGGATATGA AGADCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540 

GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATGAAGCAG 600 

AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660 

ATCGTCAAGA ATGTTCGGAA CATTGACTCT TCCGAGGGCG GGGAGAAAGA CGGCCACAAC 720 

CCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780 

AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTGAAAGAC 840 

TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA CGACGAGAAG 900 

ATTGAGGTGG ATGACCCCOC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960 

TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCT6A AGGCACTCGG AGGGGAAAAC 1020 

TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080 

GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140 

GAGGATAAAT TGAAGGAAAG CTCTGAGAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200 

CTGAGCTCCG AGAAGAATGA CACCAGCCTC COCAGCGTTG CGCCATCAAA GACAAAGTCG 1260 

TCCTCCAAGC TCTCGTCCTG GATCGCTQCC ATCGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320 

TCAGACTCCT GCAAAGAAOC AGTGGCCAAT TOGAGGGAAT C C iax w rr ACCAAAAGAA 1380 

GTAAATGACA GTCCGAGAGC CGCTCACAAG TCTCCTGAAT CCCAGAATCT CATCGACGGG 1440 

ACCAAAAAAC CATCCCTGAA GCAACCGGAT AGTCCCAGAA GCATCTCAAG TGAGAACAGC 1500 

ACCAAAGGAT CCCCGTCCTC TCCCGCAGGG TCCACACCAO CAATCCCCAA AGTCCGCATA 1560 

AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620 

GTGGATCTTG ACTCTCGAAA GAAACCTTOC GAGCAGACAG CGTCCGTGAT GGCCTCTGTG 1680 

ACATCCCTTC TGTCGTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCOCC CAGGGCGCCT 1740 

CTCCAGTCTG CGGTCGTGAC CAATGCAOTT TCCCCTGCAG AGCTCACCCC CAAACAGGTC 1800 

ACAATCAAGC CTGTGGCTAC TCCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860 

CAAGTCATTA ATTTGAAGCT CGCTAACAAC AOCACGGTGA AAGCCACGGT CATATCTGCT 1920 

GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980 

ACTGTCGTGG TGCCGGCATC CAGCCTGGOC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040 

CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT OCGCCAAGTG 2100 

CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160 

CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGC? 2220 

TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCOCTGTTT ACATCCCAAA CCTCAGTCCT 2280 

CCOGCCAATG CAGGGATCAC GTTACCGACG CGTGGGTACA AGT6CTTGGA GTGTGGGGAC 2340 

TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGCGT GCGCATCGAA 2400 

GTAACQTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATGCAG aWHtC 2460 

CAIGCCCGTG GGCATAAGGA GAAAGGGGTO GTAATGCAAT GCTCCCACTT AATTTTAAAG 2520 

CCAGTCCCAG CAGATCAAAT GATAGTTTCT COGTCAAGCA ATACTTCCAC TTCAACTTCC 2580 

ACTCTTCAGA GCCCTGTGGG AGCTGGCAGA CACACTOTCA CAAAAATTCA GTCTGGCATA 2640 

ACTGGGACAG TCATATCGGC TOCTTCAAGC ACTCCCATCA CCCCAGCCAT GCOCCTAGAT 2700 

GAAGAOCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760 

CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820 






ACTTGCACTA TCTGCCAGAT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAGAGA 2880

ATCCATCAGC ACAAATCTCC CTACACCTGC CCTGA6TQT0 GGGCCATCTG CAGGTCGGTQ 2940 

CACTTCCAGA CCCACGTCAC CAAGAACTGT CTGCACTACA CGAGGAGAGT TGGTTTTCGA 3000 

TGTGTGCATT GC AATGTT GT GTACTCTGAT GTGGCTGCTC TGAAGTCTCA CATTCAAGGT 3060 

TCTCACTGTG AAGTCTTCTA CAAGTGTCCT ATTTGTCCAA TOGOGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AGATAGGAGA ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CrGTT C TCTT 3300 

TTATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTCAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TGAATGGGAA AGAGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTCCT 3540 

GGGTGGAGGT GTTGGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGGAAGG AGCACGGGAA GCAAATGAAG AAACACCCCT GCOGOCAGTG TGACAAGTCT 3660 

TTCAGCTCGT CCCACAGCCT GTGCCGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTOTACGCCT GCTCGCACTG CCCAGACTCC AQAOGTACCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACOCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTOA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC OCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CTGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AACCAACAGG AGAACAAACC GAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

CGGACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAAATAGCCA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAATA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTGCAG TATAATAGAG TTAACAGTAC TCTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AAT6TAOCTT OCTTCACCTC GTCGTATATA TOCTCGATAA GTATTAAAAC AGTATTTGAG 4560 

TTTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGTGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

T O mWCTSTA AAACAGAGTT CTTAGTAACA GGGGCAGTTC CTGAATTCAA ATAAACCATT 4740 

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTTAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GCCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGTTTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTATTCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGGCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

AIAGATFTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GATGCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTCTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGCCATT TCAAGAAGTA 5520 

TATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACTCTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTCTACCTT TTTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



SEQ ID NO: 255 PBJ8 Protein sequence

Protein Accession #: BAB13455

MKTPDFDDLL AAFDIPDMVD PKAAIESGHD DHESHMKQNA HGEDDSHAPS SSD VGVSVIV 60 
KNVRNIDSSE GGEKDGHNFT GNGLHNGFLT ASSLDSYSKD GAKSLKGDVP ASEVTLKDST 120 
FSQFSPISS A EEFDDDEKIE VDDPPDKEDM RSSFRSNVLT OS APQQDYDK LKALGGENSS 180 
KTQLSTSGNV EKNKAVKRET EASSINLSVY EPFKVRKAED KLKESSDKVL ENR VLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSCIAAIA ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
DSPRAADKSP ESQNUDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKf 360 
DCTSSGELKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSFPRAPLQ 420 
SAWTNAVSP AELTPKQVn KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVISAAS 480 
VQSASSAUK AANAIQQQTV WPASSLANA KLVPKTVHLA NLNLLPQGAQ ATSELRQVLT 540 
KPQQQIKQAI INAAASQPPK KVSRVQWSS LQSSWEAFN KVLSSVNPVP VYIPNLSPPA 600 
NAGITLPTRG YKCLECGDSF ALEKSLTQHY DRRSVREBVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGWM QCSHULKPV PADQMIVSPS SN TS TS 1 S T L QSPVOAOTHT VTKIQSGITG 720 
TVISAPSSTP ITPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHPQQ AADTSGQKTC 780 
TICQMLLPNQ CSYASHQRJH QHKSPYTCPE OOAICRSVHF QTHVTKNCLH YTRRVGFRCV 840 
HCNWYSD VA ALKSHIQGSH CEVFYKCPIC FMAFKS APST HSHAYTQHPG KIGEPKHY 900 
KCSMCDTVFT LQTLLYRHFD QHIENQKVSV FKCPDCSLLY AQKQLMMDHI KSMHGTLKSI 960 
EGPPNLGINL PLSKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCVVECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR IKHKGIRKVY 1080 
ACSHCPDSRR TFTKRLMLEK HVQLMHGIKD PDLKEMTDAT NEEETEIKED TKVPSPKRKL 1140 
EEPVLEFRPP RGAITQPLKK UONVFKVHK CAVOGPTTEN LLQFHEHffQ HKSDGSSYQC 1200 
REOGLCYTSH VSLSRHLFIV HKLKEPQPVS KQNGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAFIKSKR MSSAEK

Nucleic Acid Accession #: AF111B47

Coding sequence: 58-1603 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I

TTTTCGTCOA CTCTTACCGG TTGGCTGGGC CA6CTGCGCC GCGGCTCACA GCTGACGATG 60 

GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120 

AACAAGGTGT GTTTTGATTG TGGTGCCAAA AATCCCAGCT GGGCAACCAT AACCTATGGA 180 

GTGTTCCTTT GCATTGATTG CTCAGGGTCC CACCGGTCAC TTGGTGTTCA CTTGAGTTTT 240 

ATTCGATCTA CAGAGTTGGA TTCCAACTGG TCATGGTTTC AGTTGCGATG CATGCAAGTC 300 

GGAGGAAACG CTAGTGCATC TTCCTTTTTT CATCAACATG GGTQTTCCAC CAATQACACC 360 

AATGCCAAGT ACAACAGTOG TGCTGCTCAG CTCTATA6GG AGAAAATCAA ATCGCTCGCC 420 

TCTCAAOCAA CACGGAAGCA TGGCACTGAT CTGTGGCTTG ATAGTTGTGT G6TTCCACCT 480 

TTSTCCCCTC CACCAAAGQA GGAAGATTTT TTTOCCTCTC ACGTTTCTCC TGAGGTGAGT 540 

GACACAGCGT GGGCATCAGC AAIAGCAGAA CCATCTTCTT TAACATCAAG GCCTGTGGAA 600 

AOCACTTTGG AAAATAATGA AGGTGGACAA GAGCAAGGAC CAAGTGTG6A AGGTCTTAAT 660 

GTACCAACAA AGGCTACTTT AGAG6TATCC TCTATCATAA AAAAGAAACC AAATCAAGCT 720 

AAAAAAGGCC TTGGGGCCAA AAAAGGAAGT TTGGGAGCTC AGAAACTGGC AAACACATGC 780

TTTAATGAAA TTGAAAAACA AGCTCAAGCT GCGGATAAAA TGAAGGAGCA GGAAGACCTO 840 

GCCAAGGTGG TATCTAAAGA AGAATCAATT GTTTCATCAT TACGATTAGC CTATAAGGAT 900 

CTTGAAATTC AAATGAAGAA AGACGAAAAG ATGAACATTA GTGGCAAAAA AAATGTTGAC 960 

TCAGACAGAC TCGGCATGGG ATTTGGAAAT TGCAGAAGTG TTATTTCACA TTCAGTGACT 1020 

TCAGATATGC AGACCATAGA GCAGGAATCA CCCATTATGG CAAAACCAAG AAAAAAGTAT 1080 

AATGATGACA GTGACGATTC ATATTTTACT TCCAGCTCAA GTTACTTTGA CGAGCCAGTG 1140 

GAGTTAAGGA GCAGTTCTTT CTCTAGCTGG GATGACAGTT CAGATTCCTA TTGGAAAAAA 1200 

GAGACCAGCA AAGATACTGA AACAGTTCTG AAAACCACAG GCTATTCAGA CAGACCTACT 1260 

GCTCGCCGCA AGCCAGATTA TGAGCCAGTT GAAAATACAG ATGAGGCCCA GAAGAAGTTT 1320 

GGCAATGTCA AGGCCATTTC ATCAGATATG TATTTTGGAA GACAATCCCA GGCTGATTAT 1380 

GAGACCAGGG CCCGCCTAGA GAGGCTGTCG GCAAGTTCCT CCATAAGCTC GGCTGATCTG 1440 

TTCGAGGAGC CGAGGAAGCA GCCAGCAGGG AACTACAGCC TGTCCAGTGT GCTGCCCAAC 1500 

GCCCCCGACA TGGCGCAGTT CAAGCAGGGA GTGAGATCGG TT6CTCCAAA ACTCTCO G TC 1560 

TTTGCTAATG GAGTCGTGAC TTCAATTCAG GATCGCTACG GTTC TTAA TA CTGAAGTCAT 1620 

GATGTGTATT TCCTGGAGAA ATTCCTCTTT AAATGAAGAA GTAACCACAT CTCAGGCGGC 1680 

AGTGAAGTCC AGATAGTTTT GCAGATTGTT TTGCTACTTT TTCATATGGT ATATGTTTCT 1740 

GATTTTTAAT ATTTCTTTTG AGAAATTCTG AGTTCTGATG TAGGAGCTTT CCTGTGATTT 1800 

CTGTTTCACG TTCCTTCCTG TCACACCCTC CTTTGGCGTC TCTGTGTATA TCCTTGCTTT 1860 

ATTTTCTTGG AACCTTTGAT TTCAACACTG AGGGCCTGGA GACCTCGGCT CCTCCTGCTC 1920 

CTGAACCAGG AGGCTTCATG TGGGGGAGGA GGAGAGGTCT CCATGTGACA CATGGGCTCA 1980 

GGGCTGCCAG AATCAGCGGA TGCTGGATGG GCCTGCAGAA ACAACACTCA CCACACACAC 2040 

TTCCTTCAAA AGACGAAAAG TGACTGGTGT CTCGTGTGAC AGATTGCTTC ATTTATGTTT 2100 

CTACATAGTA AGGTGACTGC CAA ATAATAT TTGAAGTCAT CTQTCTCTTT GTAAATTATT 2160 

TTATATGACC TATAAATTTA AAAATOTTTT TCAGTGAGTG CTTTTAACAA ACTTAAGCTT 2220 

CTGCCCTGCC AAGGGAATTA ATGTTATCTT GTGAAAGGTG TT G C TCT T T Q AATTGATGAG 2280 

AAATGGAAGA TGAGAACTCC CTAAGAGTTC TCATAATAAA TCATCTCATC ACAAATCAAT 2340 

ACGGTAXACA GAGTTAAAGT GGAATGAGGT AAGAAGATAC AGCTACAGAA AATAGTTGOG 2400 

TGTATGGGAG AACAGTCATT GTAATTGGGT AGTTTTGTTA ATAAATATTT TTAAATCTTG 2460 

CTTTTCAGAA ATTACCGAAT GTGTATAAAC AAATAAAGAA AAATAATTTA GCTGTGTTTT 2520 

AGACAGCATT AGAATATATT GTTCAGCACA GTAAAATATA TTTGAAATTT GATAAGCCAA 2580 

AAATGTGGTT TTGAATGAAT ATTTTGTGAA TCTTTCTTAA AAGCTCAAAT TTGTAGACTT 2640 

CTAAATAGAA TAAACACTTG CAGCAGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2700 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2760 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



SEQ ID NO: 257 PBM1 Protein sequence:
PBM1 Protein sequence: CAB76901 

MGDPSKQDIL TIFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLODCSG SHRSLGVHLS 60 
FIRSTELDSN WSWFQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYREKKSL 120 
ASQATRKHGT DLWLDSCWP FLSPPPKEED FFASHVSPEV SDTAWASAIA EPSSLTSRPV 180 
ETTLENNEGO QEQGPS VEGL NVPTKATLEV SSUKKKFNQ AKKGLGAKKG SLGAQKLANT 240 
CFNEIEKQAQ AADKMKEQED LAKW5KEES IVSSLRLAYK DLEIQMKKDE KMNBGKKNV 300 
DSDRLGMGFG NCRSVISHS V TSDMQTIEQE SPIMAKPRKK YNDDSDDS YF TSSSSYFDEP 360 
VELRSSSFSS WDDSSDSYWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420 
FGNVKAISSD MYK3RQSQAD YETRARLERL S ASSSISSAD LFEEPRKQPA GNYSLSSVLP 480 
NAPDMAQFKQ GVRSVAGKLS VFANG WTSI QDRYGS 



SEQ ID NO: 258 PBM4 DNA sequence
Nucleic Acid Accession #: D30891

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon)

AJSG ATACTG TCATG AAGCA G ACACATGCT GACACACCTG TTG ATCATTO TCT ATCTGGC 60 
ATAAG AAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120 
CTTC AAATGC AG AATOCAAA TTTGAACAAT AAAG AATGTT GTnCACCTT TAjCGTFGAAT 180 
GGAAACTOCA GAAAATTAG A CCGTAGTOTO TXTACAGCAT ATGGTAAAOC CAGCGAG AGT 240 
ATCTACTCAG CCCTGAGTGC TAATG ACTAT TTCAGTGAAA GG ATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AG AAAAGACA ATAGATGGAC ATATAAATTT AGGAATGCCT 360 
CTCAAGTGCC TGCCTAGTGA TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAGAGTAGC 420
AAAGAAGATC GACACATATT ACGCCAATGT GAAAATCCAA ACATGGAATC CATTCTTTTT 480
CATGTIGTTG CTATAGGAAG GACAAGAAAG AAGATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTITGTAT TTATGOCTTG AAGGGTGAGA CTATTGAAGG AGOCTTATGC 600 
AAGG ATGGCC GnTTCGGTC TG AGATAGGT GAATTTG AAT GGAAACTAAA GG AAGGTCAT 660 
AAG AAAATTT ATGG AAAACA GTCCATGGTG G ATG AAGTAT CTGG AAAAGT CTTAGAAATG 720 
GACATTOCAA AAAAAAAAGC ATTACAACAG AAAGATATOC ATAAAAAAAT TAAACAGAAT 780 
GAAAGTGCCA CTGATG AAAT TAATCAOCAG AGTCTOATAC AGTCTAAGAA AAAAGTCCAC 840 
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATTCTCCCA 900 
CCTCAGGATC TAAGOCATTA TATTAAAGAT AAAACTOGCC AGACAATFCC CAGG ATTAGA 960 
AATTATTACT TTTGTAGTTT GOCGOGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020 
AGGCCGCATC TGGGTAGGCG GTATGCTATT AATCTGGATG TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAG A ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATCC GAATTTTAAA 1140 
GAGGAGGCAC AGTGGGTAAG AAAATATTTT CGGGAAGAAC AAAAG AGAAT GAATCTTTCA 1200 
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260 
TCAGTTGCAA CCIGCGAACA GCTTACATAT TATAGCAAGT CAGTTGGGTT CATGCAATGG 1320 
GACAATAATG G AAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380 
ACCTGTCGAC ATGTTGTACA TCTTATGGTO GGTAAAAACA CACATCCAAG TTTGTGGGCA 1440 
GATATAATTA GCAAATGTGC G AAGGTAACC TTCACTTATA CAGAGTTCTG CCCTACTCCT 1500 
GACAATTGGT TTTCCATTGA GCCATGGCTT AAAGTGTCCA ATGAAAATCT AGATTATGCC 1560 
ATTTTAAAAC TAAAAG AAAA TGG AAATGCG TTTCCTCCAG GACTATGGOG ACAGATTTCT 1620 
CCTCAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAGATGGTT GTACTGTGAT TCCTCTAAAC GAACGATTGA AAAAATATCC AAACGATTGT 1740 
CAAGATGGGT TGGTAGATCT CTATGATACC ACCAGTAATG TATACTGTAT GTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGG AACACA CACACGCTTA GTTATGATAC T1X5TTTCTCT 1860 
GATGGGTOCT CAGGCTOCOC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTOCATAOC 1920 
TTTGGGCTTT TTTATCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGGATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040 
GATGAGAAAC TTGAGACCTA CGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGCGA 2100 
CTAGGATGCT TTCGCTTTOG CTCTOGCTTT CCAATACTCG GG ACTGGGGA AAOCGGGAG A 2160 
ATAGAAGCAG GCAAGGACCG CGGTGGGCAC GGGGTCAGTG AGACAGGGTC CTGCTCGOGG 2220 
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGCCAA TCGGCTFOCG AAGTAGCTGG 2280 
AGCTCTGG AG CCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTG AGCG CTGGATTOC A 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCT 2400 
CTAATG AGG A TGGAGTCTAG AGGAG AOOCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460 
TTCCATTCAC CTAAGAAAAA TCCAGAAGAC CAGACCATGC COCAAAATAG GACAATATAT 2520 
GTTACCTTGA AGGCTGTCAG AAAAGAGATA GAAACTCACC AAGGCCAAGA AATGCTTGTG 2580 
CGTGGCACAG AAGGAATCAA AGAGTACATA AAOCTTGGAA TDCCCCTCAG TTGTTTGCCT 2640 
GAAGGTGGCC AGGTGGTCAT TACATTTTOC CAAAGTAAAA GTAAGCAGAA GGAAGATAAC 2700 
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTG TCAAATTTTA CATTCATGCA 2760 
ATTGGAATTG GGAAGTGTAA AAG AAGGATT GTTAAATGTG GGAAGCTTCA CAAAAAGGGG 2820 
CGCAAACTCT GTGTTTATGC TTTCAAAGG A GAAAGCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTOCTTTCT GGAG AATGAT GATTGG AAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCAOCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAG AAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTGAGTCAGA GAAAAGAAAC 3060 
ACCTGTGTGT TG AGAGAACA AATCGTGGCT CAGTACCCCA GTmiAAAAG AGAAAGTG AA 3120 
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGGG AAAC ATTATTTGAA 3180 
TTGCATAGAA CAACGTTTGG GAAAGTAACA AAAAATTCTT CTTCG ATT AA AGTAGTGAAA 3240 
CTICTTGTAC GTCTCAGTGA CTCAGTTGGG TACTTATTCT GGGACAGTGC AACTACGGGT 3300 
TACGCCACCT GCTTTGTTTT TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAGAT 3360 
AGCATTGTGG GAGACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAATGTGTA 3420 
AGGGTG ACAT TTGGTTATG A AG AGCTAAAA G ACAAGGAAA CAAACTACTT TTTTGTTG AA 3480 
OCTTQGTTTQ AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGG AAAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 
ATACATATTA TTGGCCATOC ATATGGAGAA AAAAAGCAGA TTGATGCTTG TQCTGTGATC 3660 
CCTCAGGGTC AGCGAGCAAA GAAATGTCAG GAACGTGTTC AGTCTAAAAA AGCAGAAAGT 3720 
CCAGAGTATG TCCATATGTA TACTCAAAGA AGTTTCCAGA AAATAGTTCA CAACCCTGAT 3780 
GTGATTACXrr ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840 
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATGAGACT 3900 
OGTAGTATCA TTGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960 
CATAAAGCAT GGTATG AAGA AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020 
GAGGACTTGXfiAGAATTCAG TCTACTGGAT TTAAGGGAAT GGCTTATGGA GTTGTTATTT 4080 
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TCCTATCTGC CAGGCATTTT TCTAAGCACA TGAAG AAATT AGTCCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA All 11 11 11 1 TTTTTGAGAC TGAGTCTCAC 4260 
TCTGTCGCCT GGGCTOGAGT ACAGTGGTGC GATCTCAGCT C^CTGCAACTTCCACCTCCC 4320 
AGGTTCAAGC GATTCTTATG CCTCAGTCTC CTG AGCAGCT GGGATTACAG GCAAACGOCA 4380 
CCACACCCAG CTAAATTTTI ITITITTTTI IGTA TTTTTA GTAGAGACAG GGTTTCAOCA 4440 
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTG ATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGG AT TACAAGTTTG AGCCACTGCA CCTGGCTAAC TTGOCCTATT TTAAAGTCAA 4560 
GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620 
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGG AA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740 
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800 
AAATGCTTG A TTAG AAATG A TCTCAAAACC TTTTAGAATT TCC AAAATCT TCATATTACT 4860 
GAAACTGTCG GAATATATGG GTCCTGAAAT TCAGAAGATG ATAGTCACTC TTCCCATATT 4920 
TATAGGCTAT TAAGGCAAGG GATATCTTAA ACATCATATT ACTTTATTTA G ATTTCTACT 4980 
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTOTTTTACT TCTTCATGGT ATTATGAAG A 5040 
CTATA TAGA T GATTCAAOCA AGCCTGCAAA TCTCOCTCIT GTGGAATTCC ACTGGACOCA 5100 
ATCTGTTTTC CATTTOCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACCCTCCCT 5160
CTAGGTCCAG GGACTATCAC AGAAGAAGCA GGCATGTAAG ATTTTAAGGA CTGGTTTCGA 5220
GGGGTCGAGT GTAGG AAAAC AGCCTGTTGC ATTGTAAGAG TGATGTCAOC TTGAAGAGCA 5280 
GCTGGCATGA TGACTGCTGTTTGACTCCTG CATACCAAGA TATTCTGCAG CAATGTCTTT 5340 
AAACAGTGCC GGTAGTACAG ATAACCCCTC ATAAAGATGC TTATCTAAOC TCCCCAGTGT 5400
TCAGGTGTTT CACAAGAAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460
GTTATATAAA GGGTACTTTT GGG AGGGTOA GTGCCGCCAT TTAGTGGCTG CTAGAAACAT 5520 
TQ CnCTQTT TGTAAGTTCC TATTAAATOT TCTTTCTGAG AAAAAAAAAA A 

SEQ ID NO: 259 PBM4 Protein sequence
PBM4 Protein sequence BAB67788 

MDTVMKQTHA DTPVDHOLSG IRKCSSTFKL KSEVNKHETA LEMQNPNLNN KEOCFTFTLN 60 
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERIKNQFN KNUVYEEKT IDGHINLGMP 120

LKCLPSDSHF KTTPGQRKSS KEDGHILRQC ENPNMEC2LF HWAIGRTRK KTVKINELHB 180

KGSKLCJYAL KGETEEGALC KDGRFRSDIG EFEWKLKEGH KKIYGKQSMV DEVSGKVLEM 240 
DBKKKALQQ KDWKKIKQN ESATDEINHQ SLIQSKKKVH KPKKDGETKD VHHSREQILP 300 
PQDLSHYIKD KTRQT1PRIR NYYFCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAIN 360 
LLKNYQTLNB A1MHQYPNFK EEAQWVRKYF REEQKRMNLS PAKQFNTYKK DPGKMTANSV 420
SVATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVWGGYIF TCRHWHLMV GKNTHPSLWP 480
DUSKCAKVT FTYTEFCPTP DNWFSIEPWL KVSNENLDYA ILKLKENGNA EPPGLWRQ1S 540 
PQPSTGLIYL IGHPEGQKK DX3CTVIPLN ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLSYDTCFS DGSSGSPVFN ASGKLVALHT FGLfYQRGFN VHAUEFGYS 660 
MDSILCDIKK TNESLYKSLN DEKLBTYDEB KARPRPAYRR LGCFRFRSRF PILGTGETGR 720
IEAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780
GRVLARRAVS KEQQNNCSTS LMRMESRGDP RATTNTQAQR FHSPKKNPED QTMPQMRTIY 840 
VTLKA VRKEI ETHQGQEMLV RGTEGKEYI NLGMPLSCFP EGGQ WTTFS QSKSKQKEDN 900 
HIFGRQDKAS TECVKFYIHA IGIGKCKRRI VKCGKLHKKG RKLCVYAFKG ETIKDALCKD 960 
GRFLSFLEND DWKLIENNDT ILESTQPVDE LEGRYFQVEV EK RMVPSA AA SQNPESEKRN 1020 
TCVLREQIVA QYPSLKRESE KHENFKKKM KVKNGBTLFB LHRTTPGKVT KNSSSBCWK 1080
LLVRLSDSVG YLFWDS ATTG YATCFVFKGL FILTCRHVID SIVGDGIEPS KWATUGQCV 1140 
RVTPGYEELK DKETNYFFVE PWFEIHNEEL DYAVLKLKEN GQQVPMELYN GITPVPLSGL 1200 
gfflGHPYGE KKQIDACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKTVHNTD 1260 
VrrYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSDERjSTM ESILLDIKQR 1320
HKPWYEEVFV NQQDVEMMSD EDL

SEQ ID NO: 260 PBQ1 DNA sequence
Nucleic Acid Accession #: NM_015642
Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon)






1 11 21 31 41 51 

I I I I I I 

ACATTTCAAA AAAAATACAT AGACTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60 

TACGAAGAAT GAACTCTGAG AATGTTTGGA GAATGTTTCA TCATTACTAA CAGGATATTC 120 

CTCATGACAT TGCTGTCTGA TCTTTGACCA TCAGTCTGTG ACCTGOCCCT TCTCTTTACA 180 

TOCAOCCGCT CTCTOCTCCC TGCCCCAATG AACATCTGCA CTAGGOCCAA GCCTTGGAGT 240 

AATTTACCTG AAGAGTGACA OCATTGATTT TGAAACTACT GAAGAAACCC AAGACAGCTG 300 

AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360 

CGGGCCTTCC CTGCCTGAAC TTTGAAGCTG ITiU Vl l' T OC AGACCCAGCC CTCATCCACT 420 

CAACACATTC ACTGACAAAC TCTCACGCTC ACACCGGGTC ATCTGATTGT GACATCAGTT 480 

GCAAGGG GAT GA CCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540 

TCGAGACCCT CAACGAGCAG CGCAACCGTG GCCACTTCTQ TGACGTAACG GTGCGCATCC 600

ACGGGAGCAT GCTGCGCGCA CACCGCTGCO TGCTGGCAOC CGGCAGCCCC TTCTTCCAGG 660

ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCOTC GGTGGTGTCA GTGCAGTCAG 720 

TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 

TGCAGATCCT CACGGCCGCC AGCATCCTGC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 

GCATCGTQTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 

CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960

ACCTGCAGAG CCACCCACAG CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 

CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG CGCAGTGGTC AGCCACCACG 1080 

AGACTGCGCT CGGCCTCCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 

TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTCCAC CACCCCCGAG ACCACGCACT 1200

GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260

AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAAAG GGTGCAGATC CTGGAACGCA 1320 

ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 

AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 

AGCAGCAGTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 

AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560

CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 

GCTCCGACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 

CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 

TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGQ CAACACCTAC CTGCCAGCCC 1800

TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860

OCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 

CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 

GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTGCAA CAAGACTTTC ACCGCCAAAC 2040 

AGAACTACGT CAAGCACATG TTCGTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 

GTTGGCGCTC CTTCTCCTTA AAGGATTACC TTATCAAGCA CATGGTGACA CACACAGGAG 2160

TGAGG6CATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC CCAGAAGAGC TCCCTCAACG 2220

TGCACATGCG CCTCCACCGG GGAGAGAAGT CCTACGAGTG CTACATCTGC AAAAAGAAGT 2280 

TCTCTCACAA GACCCTCCTG GAGCGACACG TGGCCCTGCA CAGTGCCAGC AATGGGACCC 2340 

C C CCT BC AOg CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACGG 2400 

AGGGGACCAC TTACGTCTGC TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460 

ACGACCACAT GAGGATGCAT GTGTCTCAOG GATAAGTAGT ATCTTTCTCT CTTTCTTATG 2520 

AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTATGGCAC TAGAATTTAA 2580 

GAAATGTTTT GGTTTCATTT TTACTTTCTO T WSWm * TQTTTCGTTT CATTTTGTAC 2640 

TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700 

ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760 

CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCTTTACA AAAAAAAAAA 2820 
AAAAAAAAA 



PBQ1 Protein sequence NP_056457

MTERIHSINL HNFSNS VLET LNEQRNRGHF CDVTVRIHGS MLRAHRCVLA AGSPFFQDKL 60 
LLGYSDIEIP SWSVQSVQK UDHHYSGVL RVSQSEALQI LTAASILQDC TVIDECTRIV 120 
SQNVGDVFPG IQDSGQDTPR OTFESOTSGQ SSDTHSOYLQ SHFQHSVDRI YSALYACSMQ 180 
NGSGERSFYS GA WSHHETA LGLPRDHHME DPSWTTRIHE R5QQMERYLS TTPEITHCRK 240 
QPRPVRIQTL VGNIHIKQEM EDDYDYYGQQ R VQHJERNES EECTEDTDQA EGTESEPKGE 300 
SFDSGVSSSI GTEFDSVEQQ FGFGAARDSQ AEPTQPEQAA EAPAEGGPQfT NQLETGA5SP 360 
ER5NEVEMDS TVTTVSNSSD KS VLQQPSVN TSIGQPLPST QLYLRQTETL TSNLRMPLTL 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPLA GQQTQFVTVS QPOLSTFTAQ 480 
LPAPQPLASS AGHSTASGQG EKKPYBCTLC NKTFTAKQNY VKHMFVHTGB KPHQCSICWR 540 
SFSLKDYUK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYE CYICKKKFSH 600 
KTLLERHVAL HS ASNGTPPA GTTPGARAGP PGWACTEGT TYVCSVCPAK FDQIEQFNDH 660 
MRMHVSDG 



SEQ ID NO: 262 PBQ6 DNA sequence
Nucleic Acid Accession #: AJ 654 187

Coding sequence: 1412 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

ATG0T8GAAQ AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60 

AACTCTACCA CTTCTGCGOC GTCGTTTCGA CCATATAAAA ACGACCTATG CGAACTGOGT 120 

CGGAAAACTC CCTCACGATG TAAAAOGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180 

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACAOCTGA ATTAAAGGAA GACTCATGCA ACTTSTTTTC TGGCAATGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATGT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGTGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480 

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTGGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720 

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780 

GAAAAGTGTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCATAAAAAA GGCCAGACTT 840 

GGTCTCTGTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGAT AA 



SEQ ID NO: 263 PBQ6 Protein sequence

Protein Accession #: NP_060170

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES SVSGDHSGTL RRSQSDRTEY 60 
NQKLQBKMTP QGBCS VAETL TPEEEHHMKR MMAKREKHK EUQTEKD YL NDLELCVREV 120 
VQPLRNKKTD RLDVDSLFSN 1ESVHQBAK LLSLLEEATT DVEPAMQVIG EVFLQKGPL 180 
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 



Nucleic Acid Accession #: NM_014323
Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I

GGGCCTACTC TGCCGCOGCC GCCGOCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCAOCGC 60 

CCTCCAGGCT CCGGGACCCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG OCGCCGCCGC 120 

CTTCGCCTTC GCCTTTTQTT TOCTOCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGOGGGCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

CGCGGCGGAC OOCTO CT TC T OCTCOCCGCG TGCGOGTGCC CTTCTTGGCT GCGCGCCGGC 300 

GCCGCCTGGC GGGCGGGAGG GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCC CC CGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTGGCGCGC AGTCCAGOGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTG A TCCG G G CTGGGGGCGT OTACACTCGG CGCACCTGCG AGACTACAGA 540 

GOCTCGGGCC GGCACGTGTG GGGAGTGTGG ACACGTCTGC TCCGCCCCGC '1TCTCGCTGC 600 

TGAGGGGAAG GGAGGGGGCG GGCAGGTGCA GCGGCCGGGC TAGTGGGAGG GGGCGGCGGC 660

CATGGAGCGG GTGAACGACG CTTCGTGCGG CCCGTCTGGC TGCTACACAT ACCAGGTGAG 720

CAGACACAGC ACGGAGATGC TGCACAACCT GAACCAGCAG CGCAAAAACG GCGGGCGCTT 780

CTGCGACGTG CTCTTGCGGG TAGGCGACGA GAGCTTCOCA GCGCACCGOG CCGTGCTGGC 840

CGCCTGCAGC GAGTACTTTG AGTCGGTGTT CAGCGCCCAG TTGGGCGACG GCGGAGCTGC 900

GGACGGGGGT CCGGCTGATG TAGGGGGCGC GACGGCAGCA CGAGGOGGCG OGGCCGGGGG 960

CAGCCGGGAG CTGGAGATGC ACACTATCAG CTCCAAGGTA TTTGGGGACA TTCTGGACTT 1020

CGCCTACACT TCCCGCATCG TGGTGCGCTT GGAGAGCTTT CCCGAACTCA TGACGGCCGC 1080

CAAGTTCCTG CTGATGAGGT CGGTTATCGA GATCTGCCAG GAAGTCATCA AACAGTCCAA 1140

CGTACAGATC CTGGTACCCC CTGCCCGCGC CGATATAATG CTCTTTOGOC COCCTGGGAC 1200

CTCGGACTTQ GGCTTCCCTT TGGACATGAC CAACGGGGCA GCCTTGGCAG CCAACAGCAA 1260

TGGCATCGCC GGCAGCATGC AGCCAGAGGA GGAGGCAGCT CGGGCGGCTG GTGCAGCCAT 1320

TGCAGGCCAA GCCTCTTTGC CTGTGTTACC TGGGGTGGAC CGCTTGCOCA TGGTGGCTGG 1380

ACCCCTATCC CCCCAACTGC TGACTTCCCC ATTOCCCAGT G7GGCATCCA GTGOCCCTCC 1440

CCTGACTGGC AAGCGAGGCC GGGGGCGCCC AAGGAAGGCC AACCTGCTGG ACTCAATGTT 1500

TGGGTCCCCA GGGGGCCTGA GGGAGGCAGG CATCCTTCCA TGCGGTCTAT GTGGTAAGGT 1560

GTTCACTGAT GCCAACCGGC TOCGGCAGCA CGAGGCCCAG CACOTTGTCA CCAGCCTCCA 1620

GCTGGGCTAC ATCGACCTTC CTCCTCCGAG GCTGGGTGAG AATGGGCTAC CCATCTCTGA 1680

AGAOOCCGAC GGCCCCCGAA AGAGGAGCCG GACCAGGAAG CAGGTGGCTT GTGAGATCTG 1740

CGGCAAGATC TTCCGTGATG TGTATCATCT TAACCGGCAC AAGCTGTOCC ACTCTGGGGA 1800

GAAGCCCTAC TCCTGCCCTG TGTGTGGGTT GCGGTTCAAG AGAAAAGACC GCATGTCCTA 1860

CCATGTGCGG TCCCATGATG GGTCCGTGGG CAAGCCTTAC ATCTGCCAGA GCTGTGGGAA 1920

AGGCTTCTCC AGGCCTGATC ACTTGAACGG ACATATCAAG CAGGTGCACA CTTCTGAGCG 1980

GCCTCACAAG TGTCAGACCT GCAATGCTTC TTTTGCCACC CGAGACCGTC TGCGCTCCCA 2040

CCTGGCCTGT CATGAAGACA AGGTGCCCTG CCAGGTGTGT GGGAAGTACT TGCGGGCAGC 2100

ATACATGGCA GACCACCTGA AGAAGCACAG CGAGGGGCCC AGCAACTTCT GCAGTATCTG 2160

TAACCGAGGT TTCTCCTCTG CCTOCTACTT AAAGGTCCAT GTTAAAACCC AOCACGGTGT 2220

TCCCCTTCCC CAGGTCTCCA GGCACCAGGA GCCCATCCTG AATGGGGGAG CAGCGTTCCA 2280

CTGCGCCAGO ACCTATGGCA ACAAAGAAGG CCAGAAATGC TCACATCAGG ATCCGATTSA 2340

GAGCTCTGAC TCCTATGGTG ACCTCTCAGA TGCCAGCGAC CTGAAGACGC CAGAGAAGCA 2400

GAGTGCCAAT GGCTCTTTCT CCTGOGACAT GGCAGTCCOC AAAAACAAAA TGGAGTCTGA 2460

TGGGGAGAAG AAGTACCCAT GCCCTGAATG TGGGAGCTTC TTCCGCTCTA AGTCCTACTT 2520

GAACAAACAC ATCCAGAAGG TGCATGTCCG GGCTCTCGGG [missing] GGGACCTGGG 2580

CCCTGCCCTT GGCTCACCTT TCTCTCCTCA GCAGAACATG TCTCTCCTCG AGTCCTTTGG 2640

GTTTCAGATT GTTCAGTCGG CATTTGCGTC ATCTTTAGTA GATCCTGAGG TTGACCAGCA 2700

GCCCATGGGG CCTGAAGGGA AATGAGGCAG CTGCTGTGTC CCCACGGAAA CAACCATCTG 2760

GGGACTGCTG GGAAATGCTG TGAATGCGGA GGGAAGTGAT GTTTGGGTTC TGTAGCTGAG 2820

AGATTITEAT TCATTTTTAA CTGCCOOOCA ACCCCACTCC AACTCCTTCT CCACCACCCA 2880

TTCTCCCAAT GGTCTTTAGA AATAGATTTT CATCTGATAT TCTGCAGAAA TATCAATGAG 2940

ACTTGGTATO GGACAGGGGC AGAAAACACT ACATAGGCCT CCAAGGCAAA ACCAGTCCCA 3000

GTTTCTTTAA TGGGAAGAAG CTGGAATTCC TGGTGCTCAA TTCTTAGTGA COCCAATCCT 3060

ATACOCAAAT CTATGATATT CTGGGACCTC AGTGATTTTG GTCCCCTCOC ACTTCTCTAG 3120

TTCGTCATCC TCCCTTCCCA TATCCTTCAA AAGAACCACA CTAGGGTCTC CACCTACTTA 3180

TACAATGCGG ATGCCCAACT GTTTTTAAGG AAGCCAGAAG CATCCCATGG ACCATGGGGT 3240

GAGTGTCCTC CAAGAGCCCC CTGAGCTCAG CCCTCTGCCT GGAGGGCTCC AGACCTTTCT 3300

GAGCCCTGCT TGGAGGCGAG CATTTTCACT GCTAGGACAA GCTCAGCTGT TGAGGACACC 3360

CCCACCCCAA ATTTCAGTTC TTACGTGATT TTAACCATTC AACATGCTGT TGGGTTTTAA 3420

TTCTCTAATT ATTATTATTA TTGTTATTAT TTTTTAGGAC CAGTTGTAGT GAATTGCTAC 3480

TGAAAGCTAT CCCAGGTGAT ACAGAGCTCT TTGTAAACGG CAGTCACACA TTAGGGTTAG 3540

TATTAAACTT TGTTTAGATG TACCATAATT AACTTGGCTA GTTGATTGTT TGAAGTCTAT 3600

GGAAGAAATA GTTTTATGCA AAATTTTAAA AAATGCCAGT CTGGTCAGGG AAGTAGGGGG 3660

TTTCAATGCT GTTGGGAACC AGGAAGGTGG GACAGCCGGC AGGTAGGGAC ATTGTGTACC 3720

TCAGTTGTGT CACATGTGAG CAAGCOCAGG TTGACCTTGT GATGTGAATT GATCTGATCA 3780

GACTGTATTA AAAATGTTAG TACATTACTC TA



SEQ ID NO:265 PBY? Protein sequence

Protein Accession #: NP_114439

MERVNDASOG PSGCYTYQVS RHSTEMLHNL NQQRKNGGRF CDVLLRVGDE SFPAHRAVLA 60 
ACSEYFESVF SAQLGDGGAA DGGPAOVGGA TAAPGGGAGG SRELEMHTIS SKVFGDHDF 120 
AYTSRIWRL ESFPELMTAA KFLLMRS VIE ICQEVIKQSN VQILVPPARA DIMLFRPPGY 180 
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAAI AGQASLPVLP GVDRLPMVAG 240 
PLSPQLLTSP FPS VASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG UPOGLOGKV 300 
FTDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPISE DPDGPRKRSR TRKQVACBIC 360 
GKIFRDVYHL NRHKLSHSGE KPYSCPVCX5L RFKRKDRMS Y HVRSHDGSVG KPYICQSCGK 420 
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480 
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPIESSDSYG DLSDASDLKT PEKQSANGSF 540 
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGFL GDLGPAIjGSP 600 
FSPQQNMSLL ESFGFQIVQS AFASSLVDPE VDQQPMGPEG K 



SEQ ID NO:266 PBY9 DNA sequence
Nucleic Acid Accession #: NM_012429
Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

I I I I I I 

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 60 
GGGCAAAAGG CTGGGACTTT ACTOC GG GTG GCGGCGAGGA CGAGTCTGTG CTCCATCAGC 120 

TGCCGCACCC OCCGCCTCOC GCCCCCAAAC CCCATCCCCG C6GTT8AGOC ACGATGAGCG 180

GCAQAGTCGG CGATCTGAGC CCCAGGCAGA AGGAGGCATT 66CCAAGTTT CGGGAGAATG 240 

TCCAGGATGT GCT6CCG60C CTGCCGAATC CAGATGACTA TTTTCTCCTO CGTTGGCTCC 300 

GAGCCAGAAG CTTCGACCTO CAGAAGTCGQ AG6CCATGCT CCGGAAGGAT GTGGAQTTCC 360 

GAAAGCAAAA GGACATTGAC AACATCATTA GCT6GCAGCC TCCAGAGGTO ATCCAACAGT 420 

ATCTGTCAGQ GGGTATGTGT GGCTATGACC TGGATGGCTG CCCAGTCTGG TACGACATAA 480 

TTGGACC7CT GGATGCCAAG GGTCTGCTGT TCTCAGCCTC CAAACAGGAC CTGCTGAGGA 540 

CCAAGATGCG GGAGTOTGAG CTGCTTCTGC AA6AGTGT6C CCACCAGACC ACAAAGTT6G 600 

GGAGGAAGGT GGAGACCATC ACCATAATTT ATGACTGCGA 6GG6CTT6GC CTCAAGCATC 660 

TCTGGAAGCC TGCTGTGGAG GCCTATGGAG AGTTTCTCTG CATGTTTGAG GAAAATTATC 720 

CCGAAACACT GAAGCGTCTT TTTGTTGTTA AAGCCCCCAA ACTGTTTCCT GTGGCCTATA 780 

ACCTCATCAA ACCCTTCCTG AGTGAGGACA CTCGTAAGAA GATCATGGTC CTGGGAGCAA 840 

ATTGGAAGGA GGTTTTACTG AAACATATCA GCCCTGACCA GGTGCCTGTG GAGTATGGGG 900 

GCACCATGAC TGACCCTGAT GGAAACCCCA AGTGCAAATC CAAGATCAAC TACGGGGGTG 960 

ACATCCCCAG GAAGTATTAT GTGOGAGACC AGGTGAAACA GCAGTATGAA CACAGCGTGC 1020 

AGATTTCCCG TGGCTOCTCC CACCAAGTGG AG TATO AGAT CCTCTTCCCT GGCTGTCTCC 1080 

TCAGGTGGCA GTTTATGTCA GATGGAGCGG ATGTTGGTIT TGGGATTTTC CTGAAGACCA 1140 

AGATGGGAGA GAGGCAGCGG GCAGGGGAGA TGACAGAGGT GCTGCCCAAC CAGAGGTACA 1200 

ACTOOCACCT GGTCCCTGAA GATGGGACCC TCACCTGCAG TGATCCTGGC ATCTATGTCC 1260 

TGCGGTTTGA CAACACCTAC AGCTTCATTC ATGCCAAGAA GOTCAATTTC ACTGTGGAGG 1320 

TCCTGCTTCC AGACAAAGCC TCAGAAGAGA AGATGAAACA GCTGGGGGCA GGCACOCCGA 1380 

AATAACACCT TCTCCTATAG CAGGCCTGGC CCCCTCAGTG TCTCCCTGTC AATTTCTACC 1440 

CCTTGTAGCA GTCATTTTCG CACAACCCTG AAGCCCAAAG AAACTGGGCT GGAGGACAGA 1500 

CCTCAGGAGC TTTCATTTCA GTTAGGCAGA GGAAGAGCGA CTGCAGTGGG TCTCCGTGTC 1560 

TATCAAATAC CTAAGGAGTC CCCAGGAGCT GGCTGGCCAT CGTGATAGGA TCTGTCTGTC 1620 

CTGTAAACTG TGCCAACTTC ACCTGTCCAG GGACAGCGAA GCTGGGGGTG GCGGGGGGCA 1680 

TGTACCACAG GGTGGCAGCA GGGAAAAAAA TTAGAAAAGG GTGAAAGATT GGGACTTAAC 1740 

ACTTCAGGGA AGTCAGCTGC CGGGGAGAAA C TT OC T CC rA AATGAACACA TAAGTTTAGA 1800 

TCGCAATGAG GAGTAGCAGG GTAGCTGGTT G CTAGAG TTA CGGT GGGGAT CAGAAACTCT 1860 

TCCAAACATT TTAGCACTGA GGCTGGGGTA GCTTTTGGCT TTTCCCAGGT CTCAGGAGGT 1920 

GGCCTGAGTC AGCACACATC TTCCCACTCG GTAGACAGGC TGGCCTCTCC CTCACTTTGA 1960 

GACTTTGGGA ACTCCTGGGC CACACGGCCT GCCTCTTTGA TTACTAATGA TTGTCAGTGA 2040 

CTCAGAGCTT CCTGGGACTT CGGOTACCCA CCCGCTGTTC TCCATCCAAA CAAAGCGCCA 2100 

GGGAAATGAC CCACAGGGAT CGCAGCTGCA GGGAGGGCCA GGGAGGTTGG GGGTGGGAGT 2160 

GAATGCTAAA AGCAGATCGT CCAGTGCCCT TTTCAGTGCT ACOGGCCTCT CACCAAGCAG 2220 

TCCTCCATGT GAGCAACCCC GAGACAAAAA TGCEAAGTGG GATCAAGAGA GCAGCACTCG 2280 

GAGAGGGTGT TTGCCAGTCT GAGTGTCCCG CGGTGCCCGC CAACCCGCTT CCTGACTGAC 2340 

CTGAGCAAGG TCTTACTAAG CAGTOCCATC TCTGTGGGAG GCATGCAACG CGTGCAGGGA 2400 

GTTCAGGTGC CGGTCGGCGT AGCCAGGCCT GGAGGCCOCC CAGGCAGGAG GCOGCCCAAA 2460 

GGCGGGGCOG GCGTCTGGCA GACTAGGGGC TGGGGGCGGC CACAGACGGC C TCGAA AOCA 2520 

CAGCCCTTAC CCCAATCCCA CGAGCCCCGC CAAC6AA0CA CAGGTGCTGG GCTTTAGAGA 2560 

ACATGGGAAG GCGGCCCCAG ACCTGGCGGG AACGCCTTTC CCTCAGAGCC AGGCCCCGGC 2640 

CCCGTCTGGG AAGCTCATCT TGOGAAGCTG AGGGAGCTCA GGGCAAAGGC CAGGCTAGCG 2700 

CGGACCGGAA GGGGCCGAGG CTGCACGGGC CTCTGCCAGA ACGCTCAGGA CATCCCGGCC 2760 
TGGGTTTACA ACGCTGTTAG GAAAATTAAC CAATGAATAA AGCAACGTTC AGTGCGCA 



SEQ ID NO:267 PBY9 Protein sequence
Protein Accession #: NP_036561

MSGRVGDLSP RQKEAXAKFR ENVQDVLPAL PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60 
EFRKQKDIDN HSWQPPEVI QQYLSGOMCG YDLDGCPVWY DIIGPLDAKG LLFSASKQDL 120 
LRTKMRECEL LLQECAHQTT KLGRKVETIT IIYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180 
NYPETLKRLF WKAPKLFPV A YNLIKPFLS EDTRKKIMVL GANWKEVLLK HKPDQVPVE 240 
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYHLFPG 300 
CVLRWQFMSD GADVGPGIFL KTKMGERQRA GEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360 
YVLRFDNTYS FTHAKKVNFT VEVLLPDKAS EEKMKQLG AO TPK 



SEQ ID NO:268 PBHB DNA sequence
Nucleic Acid Accession #: XM_009756

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60 

CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120 

TATATCCGAG ACCGCTTCTG TCCATTTAGG CTTATCCCAG GTGGAGCTCA CGGGCAACAG 180 

TATTTATGAA TACATCCATC CTTCTGACCA CGATGAGATG ACCGCTGTCC TCACGGCCCA 240 

OCAGCCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300 

ATGAAATGTG TCTTGGCGAA AAGGAACGOG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360 

CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420 

TGCTAOCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCAGCGAG TGCCATCACC 480 

GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCGA GCCTTGACCT GAAGCTGATA 540 
TTCCTGGATT CCAGGGTGAC CGAGGTGAOG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600

ACCCTATACC ATCACGTGCA CGGCTGCGAC GTGTTCGAOC TOCGCTACGC ACAOCAOCTC 660 

CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTACCGGC TGCTGTCCAA GOGGGGOGGC 720 

TGGGTGTGGG TGCAGAGCTA CGOCACCGTG GTGCACAACA GCCCCTCGTC CCC GOOOC AC 780 

TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840 

CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCOT OTCTACCTCA 900 
CAAGAAACTA GGAAATEAGT GAAACCCAAA AATACCAAOA TGAAGACAAA GCTGAGAACA 960

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020 

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080 

CACTCAGAAA GCACTGACCT TCTGTACAOQ CCATCCTACA GCCT6CCCTT CTCCTACCAT 1140 

TACGGACACT TCCCTCTGGA CTCTCACOTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200 

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT GAGCACACTG 1260 

CCAGCCAGCG GTCAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAG CT CGTCT 1320 

CCAGCTAAAA ATCCTCCAGA 6CCA0CGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380 

TACOAAOGCA AGCAGATGTC CTCTGCGGAQ ATACCGCCAQ CTCCOCAGQA CGCAGACTGA 1440 
CTCCTQTTTG CTCGCTGGAC CAAC 



SEQ ID NO:269 PBHB Protein sequence

Protein Accession #: NP_005060

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAJTSQLDKA SHRLTTSYL KMRAVFPEGL 60 
GDA WGQPSRA GFLDOV AKEL GSHIXQTLDG FVFW ASDGK IMYISET AS V HLQLSQ VELT 120 
GN5IYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYHERSF FLRMKCVLAK RNAGLTCSOY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VA VGQSLPPS AITHKLYSN MFMFRASLDL 240 
KUFLDSRVT EVTGYEPQDL IEKTL YHHVH GCD VFHLR Y A HHLLLVKGQ V TTKYYRIXSK 300 
RGGWVWVQS Y ATWHNSRSS RPHOVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQFHSE5SDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQFQG SFCEVARFFL 480 
STLPASGEOQ WHYANPLVPS SSSPAKNPPB PPANTARHSL VPSYEAPAAA VRRFCEDTAP 540 
PSFPSOGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VHTNGR 



SEQ ID NO:270 PpV9 DNA sequence

Nucleic Acid Accession #: AA760894

GGCAOGAGGA GAAG ATGTGG CTTGCTCATG CTTGACTTCT GOCATGGTTG TGAGGCCICC 60 
OCAGOCATGT G GAACTGTTT TCAGGTGCTG GTTCCATQGC TCTTCCTGAG CCGAAAATAA 120 
GGAAACTCCA TAGAOCTTGT CCACTGG AAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180 
GGTGATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG COCOCAAGGT TATATCCATC 240 
TAGAACTTCA GCACGTAATT TCATCTGGAA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300 
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATOCAATOAA AATGTCTTCA 360 
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGAGGCA 420 
GAGATTGGAG GGATGCAGCC AOCGGCCCAG GAATGCCAGC AGCCACOCAG AAGCTGGAAG 480 
GAAATG AGGG ATTCTCTOCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540 
TTTCGACTTG GCCATAGCTT GTATACTCTT ACTTTGGATA CAATTTTATC CAAACTTGGC 600 
TAAACAGTTT CTCAGGCTAT GG AAAATTTA AAATGGAGAA GATTCAACTC GATTCTTACA 660 
GATTCAAAGC AAGAAAATGA TGGGAACATA GG AGG AGACC AAGAAAGCCT ATAAAAAGCA 720 
AAAATATGAA GTGAACATTG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780 
TACACATGAA AACCCCCAAG GGG AATCCCC ATATCACAGT GTAGTGTGAT ATTTGACATT 840 
YGTCATCATY TAGAG ATGTA CAG AAAAGGT G AATCTGTCT TCTGTATATT CTGCCTAAGG 900 
CAAAGAAATG TTTAGCTYTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960 
GAAAACTGTA AGCTTOCCAT ATCTGGAGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020 
A 111 TCI 1 A C TTGGACATTT eATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080 
TATAGCTGCT AACACTTOCC GCAG AGCTAA AOCATTACAG ANTATG AAAT AAAG AOOCTA 1140 
TTGATTTG AA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAATGA 

SEQ ID NO:271 PBQ4 DNA sequence
Nucleic Acid Accession #: AA149579

Coding sequence: 1-1363 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

I I I I I I 

ATGG AATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTO TTTACCTAAT 60 

GOCATAAATO GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGT66AGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCAIGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGAATTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300 

CTGTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360 

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTGTCTCA GCTTQGGCAC TTCAGTTAGQ ACCTAAGGAT 480 

GCCAGCOGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCOGOC AGTTGAATTT CATTCCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAG 600 

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660 

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT OCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTACTTTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900 

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TGCCTACAGC 960 

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGGAGAAT TGAAATGTAT 1080 

ATCTOCTTTG GC ATAATG AQ CCTK5GCTTA CTriUXTOC TGGCACTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCAGTCTAC ACTTGGATAT 1200 
GTCGCTCTGC TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACG AGCTTTTGAG 1260
GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTQ CT CTTGT TTTGCCCTCA 1320 
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA 

SEQ ID NO:272 PBQ4 Protein sequence
Protein Accession #: none



1 11 21 31 41 51

I I I I I I 

MBSISMKGSP KSLSETCLPH GXNGTKDARK VTVGVIGSGD FAKSLTXRLX RCOYHWIOS 60 

RNPKFASBPF PHWBVTHHB DALTKTNIIF VAIHREHYTS LWDLRHLLVO KILHJVSNNM 120 

RXNQYFBSNA EYLASLFPDS LIVKGFNWS AMALQLGPKD ASRQVY1CSM NIQARQQVIE 180 

LARQLNFIPI DLGSLSSARB IENLPLRLFT LWRGPVWAI SLATPFFLYS FVRWIHFYA 240 

RNQQSDFYKI PIEIVNKTLP XVAITLLSLV YLAGLLAAAY QLYYGTKYRR PPPWLETWLQ 300 

CRKQLGLLSF FFAMVHVAYS LCLPMRRSER YLFLNMAYQQ VHANIENSWN EEEVWRIEMY 360 

ISFGZHSLGL LSLLAVTSIP SVSNALNWRE FSFIQSTLGY VALLISTFHV LIYGWKRAFE 420 
EEYYRFYTPP NFVLALVLPS IVILDLLQLC RYPD 

SEQ ID NO:273 PBQ5 DNA SEQUENCE

Nucleic Acid Accession #: NM_001973

Coding sequence: 150-1445 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 

I I I I I I 

CCGCCGCCTT CTACTCCGCC GC G GGGGTCG CACCGGCTGC OGCGCCGTCC TCGAGTTTCC 60 

AGCGTGAGGA GGAGGCTGAG GGC6GAGAGQ CGCATCGTGT TCGAGGCGGA GACCGAGGGG 120 

GAGCCCCGCG CGCGGCGTCG CTCATTGCTA TGG ACAGTGC TATCACCCTG TGGCAGTTCC 180 

TTCTTCAGCT CC7GCAGAAG CCTCAGAACA AGCACATGAT CTGTTGGACC TCTAATGATG 240 

GGCAGTTTAA GCTTTTGCAG GCAGAAGAGG TGOCTCGTCT CTGGGGGATT CGCAAGAACA 300 

AGCCTAACAT GAATTATGAC AAACTCAGCC 6AGCCCTCA6 ATACTATTAT GTAAAGAATA 360 

TCATCAAAAA AGTQAATGGT CAGAAGTTTG TGTACAAGTT TOTCTCTTAT CCAGAGATTT 420 

TGAACATGGA TCCAATQACA GTGGGCA66A TTGAGGGTGA CTGTGAAAGT TTAAACTTCA 480 

GTGAAGTCAG CAGCAGTTCC AAAGATGTGG AGAAT66A6G GAAAGATAAA CCACCTCAGC 540 

CTGGTGCCAA GACCTCTAGC CGCAATGACT ACATACACTC TGGCTTATAT TCTTCATTTA 600 

CTCTCAACTC TTTGAACTCC TCCAATGTAA A6CTTTTCAA ATTGATAAAO ACTGAGAATC 660 

CAGCCGAQAA ACTGGCAGAG AAAAAATCTC CTCA6GA6CC CACACCATCT GTCATCAAAT 720 

TTGTCACGAC ACCTTCCAAA AAGCCACCAG TTGAACCTGT TGCTGCCACC ATTTCAATTG 780 

GCCCAAGTAT TTCTCCATCT TCAGAAGAAA CTATCCAAGC TTTGGAGACA TTGGTTTCCC 840

CAAAACTGCC TTCCCTOGAA GCCCCAACCT CTGCCTCTAA CGTAATOACT GCTTTTGCCA 900 

CCACACCACC CATTTCGTCC ATACCCCCTT T6CAGGAACC TCCCAGAACA CCTTCACCAC 960 

CACTGAGTTC TCACCCAGAC ATCGACACAG ACATTGATTC AGTGGCTTCT CAGCCAATGG 1020 

AACTTCCAGA GAATTTGTCT CTGGAGCCTA AAGACCAGGA TTCAGTCTTQ CTAGAAAAGG 1080 

ACAAAGTAAA TAATTCATCA AGATCCAAGA AACCCAAAGG GTTAGGACTG GCACCCACCC 1140 

TTGTGATCAC GAGCAGTGAT CCAAGCCCAC TGGGAATACT GAGCOCATCT CTCCCTACAG 1200 

CTTCTCTTAC ACCAGCATTT TTTTCACAGA CACCCATCAT ACTGACTCCA AGOOCCTTGC 1260 

TCTCCAGTAT CCACTTCTGG AGTACTCTCA GTCCTGTTGC TCCCCTAAGT CCAGCCAGAC 1320 

TGCAAGGTGC TAACACACTT TTCCAGTTTC CTTCTGTACT GAACAGTCAT GGGCCATTCA 1380 

CTCTGTCTGG GCTGGATGGA CCTTCCACCC CTGGCCCATT TTCCCCAGAC CTACAGAAGA 1440 

CATAACCTAT GCACTTGTGG AATGAGAGAA CCGAGGAACG AAGAAACAGA CATTCAACAT 1500 

GATTGCATTT GAAGTGAGCA ATTGATAGTT CTACAATGCT GATAATAGAC TATTGTGATT 1560 

TTTGCCATTC CCCATTGAAA ACATCTTTTT AGGATTCTCT TTGAATAGGA CTCAAGTTGG 1620 

ACTATATGTA TAAAAATGCC TTAATTGGAG TCTAAACTCC ACCTCCCTCT G T CTT TTCC T 1680 

TTTCTTTTTC TTTCCTTCCT T OC T T TT C TT CTCTCCTT TA AAAATATTTT GAGCTTTGTG 1740 

CTGAAGAAGT TTTTGGTGGG CTTTAGTGAC TGTGCTTTGC AAAAGCAATT AAGAACAAAG 1800 

TTACTCCTTC TGGCTATTGG GACCCTTTGG CCAGGAAAAA TTATCCTTAG AATCTATTAT 1860 

TTAAAGAAGT ATTTGTGAAA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AAA 



SEQ ID NO:274 PBQ5 Protein sequence
Protein Accession #: NP_001964

MDSAITLWQF LLQLLQKPQN KHM1CWTSND GQFKLLQAEE VARLWGIRKN KPNMNYDKLS 60 
RALRYYYVKN HKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120 
ENGGKDKPPQ PGAKTSSRND YIHSGLYSSF TLNSLNSSNY KLFKUKTEN PAEKLAEKKS 180 
PQEPTPSVIK FVTTPSKKPP VEPVAATISI GPSBPSSEE TIQALETLVS PKLPS LEAPT 240 
SA5NVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDSVASQPM ELPENLSLEP 300 
KDQDSVLLEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LGILSPSLPT ASLTPAFFSQ 360 
TPHLTPSPL LSSIHFWSTL SPV APLSPAR LQGANTLFQF PSVLNSHGPF TLSGLDGPST 420 
PGPFSFDLQKT 



SEQ ID NO:275 PBY3 DNA SEQUENCE

Nucleic Acid Accession #: AB040921

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon)
1 11 21 31 41 51 

I I I I I I 

AATCAGGAAC AQATCATATA TTGACCGAGA TTCTGAQTAT CTCTTGCAAG AAAATGAACC 60

AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTC6 120 

GTATATTGAA ATGCAGCATT TCAGAGAAAA GCTGOCTTCG TATGGAATGC AAAAGGAATT 180 

GGTAAATTTA ATTGATAfi.CC ATCAGGTAAC AGTAATAAGT GGTGAAACTG GTT6TGGCAA 240 

AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAGGATCTGC 300 

TTGCAGAATA GTTTGTACTC AGCCAAGAAG AATTAQTOOC ATTTCAGTTG CGGAAAGAGT 360 

AGCTCCAGAA AGGGCAGAAT CTTGTGGCAG T6CTAATAGT ACTGGATATC AAATTCGTCT 420 

CCAGAGTCGG TTGCCAAGGA AACAGGGTTC TATCTTATAC TGTACAACAG GAATCATCCT 480 

TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTAGT CATATCGTAC TTQATGfiAAT 540 

CCATGAAAGA AATCTGCAGT CAQATGTTTT AATQACTGTT GTTAAAQACC TTCTCAATTT 600 

TCGATCTGAC TTGAAAGTAA TATTGATGAQ TGCAACATTG AATGCAOAAA AGTTTTCAGA 660 

ATATTTTGGT AACTGTCCAA TGATACATAT AOCTGGTTTT ACOTTCCGG TTGTGGAATA 720 

TCTTTTGGAA GATGTAATTQ AAAAAATAAG GTAUVl'l UJ A 6AACAAAAA6 AACACAGATC 780 

CCAGTTTAAO AGGGGTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAG AAGAAAAAGA 840 

AGCAATATAT AAAGAACGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GQTATTCTGC 900 

AAGTACTGTA GATGTTATAG AAATGATGGA GGATGATAAA GTTGATCTGA ATTTGATTGT 960 

TGCCCTCATC CGATACATTG TTTTGGAAGA AQAOGATGGT GCQATACTQQ TCTTTCTGCC 1020 

AGGCTGGGAC AATATCAGCA CTTTACATGA TCTCTTQATG TCACAAGTAA TGTTTAAATC 1080 

AGATAAATTT TTAATTATAC CTTTACATTC ACTGATGCCT ACAGTTAACC AGACACAGGT 1140 

GTTTAAAAGA ACCCCTOCTG GTGTTCGGAA AATAGTAATT GCTACCAACA TTGCGGAGAC 1200 

TAGCATTACC ATAGATGATG TCGTTTATGT GATAGATGGA GGAAAAATAA AAGAGACGCA 1260 

TTTTGATACT CAOAACAATA TCAGTACAAT GTCCGCTGAG TGGGTTAGTA AAGCTAATOC 1320 

CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAACCTGGT CATTGCTATC ATCTGTATAA 1380 

TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA GAACTCCTTT 1440 

GGAAGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTF ATTTTCTGAQ 1500 

TA3ATTAATQ GACCCACCAT CAAATGAGQC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560 

GCTGAACGCT TTGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620 

ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680 

OCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740 

AAAAGAAAAG ATTGCAGATO CAAGAAGAAA GGAATTGGCA AAGGATACTA 6AAGTGATCA 1800 

CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAGATA 1860 

CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCAGA TGCTGCATAA 1920 

CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980 

TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040 

T GC TG GT T TA TATCCCAAAG TTGCTAAAAT TCGACTAAAT TTGGGTAAAA AAAGAAAAAT 2100 

GGTAAAAGTT TACACAAAAA CCGATGGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160 

GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220 

TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTOlTgrm ' TTGGAGGTGA 2280 

CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340 

TCAGTCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400 

TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460 

CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520 

GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580 

GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCCAAACC 2640 

CTGGGACATG AACAATTTTC ATGTGTAAGG TAGAAGCCTT CAGTAGGTAG TAAAGACTTA 2700 

ATGTGCATGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCATAAAAGC 2760 

AATATGTTCT CTGATCATAT ACTCTCCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820 

CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT ATACAAAATT 2880 

AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCCCAAA 2940 

ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000 
AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC 

SEQ ID NO:276 PBY8 Protein sequence

Protein Accession #: BAA96012 

IRNRSYIDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YIEMQHFREK LPSYGMQKEL 60
VNUDNHQVT V1SGETGCGK TTQVTQFILD NYIERGKGS A CRIVCTQPRR KAISVAERV 120 
AAERAESOGS GNSTGYQIRL QSRLPRKQGS ILYCTTOIIL QWLQSDPYLS SVSHEVLDQ 180 
HERNLQSDVL MTWKDLLNF RSDLKVUMS ATLNAEKFSE YFGNCPMIHI PGFTFPWEY 240 
LLED VIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYSA 300 
STVD VEEMME DDKVDLNUV AURY1VLEE EDGAILVFLP GWDN1STLHD LLMSQVMFKS 360 
DKHJIPLHS LMPTVNQTQV FKRTPPGVRK IVIATN1AET SITIDDWYV HX3GKIKETH 420 
FDTQNNISTM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRAS1XDDY QLPHLRTPL 480 
EELCLQIKIL RLGGIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TFLGVHLARL 540 
PVEPfflGKMI LFGALFCCLD PVLTIAASLS FKDPFVTPLG KEKIADARRK ELAKDTRSDH 600 
LTWN AFEGW EEARRRGFRY EKD YCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660 
KDPESNINSD NEKIIKAV1C AGLYPKVAKI RLNLGKKRKM VKVYTK1DGL VAVHPKS VNV 720 
EQTDFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQB TTAVDEWIVF 780 
QSPARIAHLV KELRKELDIL LQEKESPHP VDWNDTKSRD CAVLSAHDL DCTQEKATPR 840 
NFPPRFQDGYYS 



SEQ ID NO:277 PBY6 DNA SEQUENCE

Nucleic Acid Accession #: AA464018

Coding sequence: 64-1689 (underlined sequence corresponds to start and stop codon)

GATTTTATOC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATQAAGATG A AATTCCAGAT 60 
CTTATGGATC TGAGACAAGC TTGTCGGACG CCTAGCCGGG ATC AGGCCGO GGTGGAACTG 120 
CTGATGACAT ACTTCATCCA GCTGOOCTTT OTCGAOAGTC OATTCTTOCC GOCCACACGG 180
CAG ATGGG AC TCCTGTTCAC CTGGTATGAC TCTCTCAOCG GGGTTCCGGT CAGOCAGCAG 240 
AACCTGCTGC TGGAGAAGGC CAGTGTCCTG TTCAACACTG GGGCCCTCTA CAOOCAGATT 300 
GGG ACOOGGT GTGATOGGGA GAGGCAGGCT GGGCTGGAGA GTGCCATAGA TGCCTTTCAG 360 
AGAGCGGCAG GGGTTTTAAA TTACCTGAAA GACACATTTA CCCATACTOC AAGTTACGAC 420 
ATGAGCCCTG CCATGCTCAG OGTGCTCCTC AAAATGATGC TTOCA CAAGC CCAAGAAAGC 480 
GTGTTTGAGA AAATCAGOCT TOCTGGGATC OGGAATGAAT TCTTCATGCT GGTG AAGGTG 540 
GCTCAGG AGO CTGCTAAGGT GGGAGAGGTC TAOCAACAGC TACACGCAGC CATGAGCCAG 600 
GCX3CCGGTGA AAOAGAACAT GCGCTACTCC TGGGCCAGCT TAGCCTGCGT GAAGGOOCAC 660 
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGACCA CCAGGTGAAG 720 
CCAGGCAOGG ATCTGGAOCA CCAGGAGAAG TGOCTGTOOC AGCTCTACGA CCACATOCCA 780 
GAGGGGCFGA CACOCTTGGC CACACTGAAG AATGATCAGC AGOGCGGACA GCTGGGGAAG 840 
TOCCACTTGC GCAGAGGCAT GGCTCATCAC GAGGAOTCGG TGCGGG AGGC CAGOCTCTGC 900 
AAGAAGCTGC GGAGCATTG A GGTGCTACAG AAGGTGCTGT GTGCCGCACA GGAAOGCTOC 960 
CGOCTCAOGT ACGCCCAGCA GCAGGAGGAG GATGAOCTGC TGAACCTGAT CGACGCGGCC 1020 
AGTGTTGTro CTAAAACTG A GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080 
ACAGTCACGG A C1 1C1 1C CA GAAGCTGGGC CCCTTATCTG TGTTTTCGGC TAACAAGOGO 1140 
TGGACGCCTC CTCGAAGCAT CCGCTTCACT GCAGAAGAAG GGGACTTGGG GTTCACCTTG 1200 
AGAGGGAACG CCCCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTCGGTGGCA 1260 
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTO TGGATTGTAA GTGGCTGACG 1320 
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACGAGATOGA GATGAAAGTC 1380 
GTG AGCCTCC TGG ACTOCAC ATCATCCATG CATAAT AAG A GTGCCACAT A CTOCGTGGGA 1440 
ATGCAGAAAA CGTACTCCAT GATCTGCTTA GCCATTGATG ATGACGACAA AACTG ATAAA 1500 
ACCAAGAAAA TCTCCAAGAA GCTTTCCTTC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620 
AAGCTGOCCT CCCCnTCAG CCTTCTCAAC TCAGACAGTT CTTCGTACL1A 



SEQ ID NO:278 PBY6 Protein sequence
Protein Accession #: NP_149094

DFILEHYSED GYLYEDOAD LMDLRQACRT PSRDEAG VBL LMTYHQLGF VESRFFPPTR 60 
QMGLLFTWYD SLTGVPVSQQ NLLLEKAS VL FNTGALYTQI GTRCDRQTQA GLESAIDAFQ 120 
RAAG VLNYLK DTFTHTPS YD MSPAMLS VLV KMMLAQ AQES VFEKISLPGI RNEFFMLVKV 180 
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILUDHQ VK 240 
PGTDLDHQEK CLSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRRAMAHH EESVREASLC 300 
KKLRSIEVLQ KVLCAAQERS RLTYAQHQEE DDLLNUDAP SWAKTEQEV DXILFQF5KL 360 
TVTDFFQKLG PLS VFSANKR WTPFRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCS AS VA 420 
GAREGDYIVS IQLVDCKWLT LSEVMKLLKS FGEDEIEMKV VSLLDSTSSM HNKS ATYSVG 480 
MQKTYSM1CL AIDDDDKTDK TKKISKKLSF LSWGTNKNRQ KSASTLCLPS VGAARPQVKK 540 
KLPSPFSLLN SDSSWY 



SEQ ID NO:279 PBY8 DNA SEQUENCE

Nucleic Acid Accession #: AF107493

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

I I I I I I 

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCOCAQA ACCGCTACTG CTGCTTCGOT 60 

CTCTCCTTGO GAAAAAATAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120 

GACAATCGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180 

CATAGACAGG GATGACCGTG ATGAGCGTGA AtCCCGAAGC AGGCGGAGGG ACTCAGATTA 240 

CAAAAGATCT AGTGATGATC GGAGGGGTQA TAGATATGAT GACTACCGAG ACTATGACAG 300 

TCCAGAOAGA GAGCGTGAAA GAAGGAACAG TGACCGATCC GAAGATGGCT ACCATTCAGA 360 

TGGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420 

CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480 

GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540 

CTTGCTTAGT TCCTGATATT ATTGTTCTCT TCCCCATTCC CACCT CAOTC CCTAAAGAAC 600 

ATCCTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTAA 660 

GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CTTTGTTT^ 720 

GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT C CTTTTTTTT TTTGACTATA 780 

GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840 

TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900 

TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960 

TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGGTGTAAG 1020

CCGTGGTTTC GCCTTCGTGG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080 

CAATCAGGTT GCTTCACTCA CCAAGTCTAG ATATTCATGA AAAT GGAACA AGTCTGTACA 1140 

ATTTTAAAAA AAGGTTGAAG GAGTGGTTTG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200 

AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260 

ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320 

AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380 

TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440 

AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAQ 1500 

ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAGGGAAT ATTTAAGQAC 1560 

TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGSAtTTGGT GTCCTGCAGC 1620 

TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATAGTCTGT 1680 

TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740 

CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800 
TTATTQAATQ CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTOCTT 1860

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAGT TACATTCTTT 1920

GATTGGTAAT ATTGOCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980

GATTTGTTAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2040

TTAAAAGTAQ AGCTTCATTC ATTTCATACC ATAGATACCA TCCTAGTAAA TCCAGAACAT 2100

ATACAAGGTT CATOTGAGTC TGCTTTCTTO ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160

TGTCAGAATG ACTAACCTAG GAGTTTGAAA CTCCTAAGAA ACTAAAAOCT GTAAGACATT 2220

TAAAAGTCTC CACAATTTTA ATGTATACAA AGCTATGTTA CTGTGTAACA CA3TACAGTT 2280

CAAATTCACT CCAGAAATAA AAGGCCAGTA GGATTAGGGA C7CACTGGTA GTTTCGAGTC 2340

TCCCAGCACA CATCCCTCCT AGTGGGAIGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400

TTTGCTTCTQ TATATCACAG TGAGTGGAT6 GCCCTTCAGC TTTTTCTCTC CTGGCCAGAC 2460

ATGCAGTCTT GCCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520

GGATTTGCAA GAACCAAATT GCTCAACAGT ATGTATGTTT AGAGGGGTTA GACTCCITTT 2580

TAAAATCTGQ ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA CCACCCCCAC 2640

CCTTGATCCT CCCACCCCCA AAAAAAAAAA AAAA



SEQ ID NO:280 PPY9 Protein sequence
Protein Accession #: XP_003261

MGSDKRVSRT ERSGRYGSD DRDDRDERES RSRRRDSDYK RSSDDRRGDR YDDYRDYDSP 60 
ERERERRNSD RSEDG YHSDG DYGEHDYRHD ISDERESKTI MLRGLPITIT ESDIREMMES 120 
FEGPQPADVR LMKRKFGESLLSS 



SEQ ID NO:281 PCI2 DNA SEQUENCE

Nucleic Acid Accession #: AF208291

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

I I I I I I 

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTOAGGC TGAGGGGGCC GAGCTCGC GC 60 

GCCGCGTTCC CTTCTCCgfT GCCATGAACC GCGGACACCC CGGCCCCGAT GG CCCCOGTO 120 

TACGAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA ATCAAGTGCC 180 

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240 

GGCTCCCACA GCAAAGTGTA CAGCCAGAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300 

ACAACCGTCA GCACCTOCTT GCCGOTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360 

GTCTTOCCAG GAAGCACCGG GCACATCGTG GTCACCTCAG CAAGCAGCAC TTCTGTCACC 420 

GGGCAAGTOC TCGGCGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480 

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540 

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATO CAAGCGGGGC CACTGTCGCC 600 

ACTGOCACCA CGTCTACTGC CAOCTCCAAA AACAGCGGCT CCAACAGCGA GGGCGACTAT 660 

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA CCTACGAGGT CTTAGAGTTC 720 

TTGGGCCGAG GGACGTTTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780 

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAGTG 840 

AGCA TOCTGG CCCGGTTGAG CACGGAGAGT GCC GAJGACT ATAACTTCGT CCGGGCCTAC 900 

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960 

TATGACTTTC TGAAGCAAAA CAAGTTTAGC CCCTTGCCCC TCAAAIACAT TCGCCCAGTT 1020 

CTCCAGCAGG TAGOCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTCAC 1080 

CTCAAAOCAG AAAACATCAT GCTGGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140 

ATCGACTTTG GTTCAGCCAG CCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200 

AGATATTACA GGGCCOCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATG 1260 

TGGTOCCTGG GCTGTGTTAT TGCAGAATTG TTCCTSGOTT GGCCGTTATA TCCAGGAGCT 1320 

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380 

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAACCGTG ACACGGACTC ACCATATOCT 1440 

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500 

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560 

TTGGAAGGGA GCGACATGTT GGTAGAAAAG GCTGACCGGC GGGAGTTCAT TGACCTGTTG 1620 

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680 

CCCTTTGTCA CCATGACACA CTTACTCGAT TTTCCCCACA GCACACACGT CAAATCATGT 1740 

TTCCAGAACA TGGAGATCTG CAAGCGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800 

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860 

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920 

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980 

GCATCCATGG CTGCAGTGGC CCAGCGGAGC ATGCCCCTCC AGACAGGAAC AGCCCAGATT 2040 

TGTGCCCGGC CTGACCCGTT CCAGCAAGCT CTCATCGTGT GTOCCCCCGQ CTTCCAAGGC 2100 

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGGAAAA TGCAGTTCCC 2160 

ATCGTCACTC AAGCCOCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220 

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTGCTTCCCC CAGCATGGCA GCAACTGACT 2280 

GGAGTGGCCA CCCACACATC AGTGCAGCAT GCCACCGTGA TTCCCGAGAC CATGGCAGGC 2340 

ACCCAGCAGC TGGCGGACTG GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400 

ATGCAGCAGC CTGCACTATT GACCGGTCAT GTGACCCTTC CAGCAGCACA GCCCTTAAAT 2460 

GTGGGTGTGG CCCACGTGAT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520 

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTQ AGGTGTCCTC CTCTCAGGCC 2580 

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640 

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GT CA CCTGTG GGTGGGGCGA CGTGGCCTCC 2700 

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGCCCCACG 2760 

GTCAGCGTCA TCACCATCAG CAOTGACAOG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820 

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT OCACGACTCC 2880 

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTCCG TGCAGCAGOG TGCTGGGCAC 2940 
AACAATGCCA ATGCCTTTGA CACCAAGGGG AGCCTGGAGA ATCACT6CAC GGGGAACCCC 3000

CGAACCATCA TCGTGCCACC CCTGAAAACC CAGGCCAGCG AAGTATTGGT GGAGTGTGAT 3060 

AGCCTGGTGC CAGTCAACAC CAGTCACCAC TCGTCCTCCT ACAAGTCCAA 6TCCTCCAGC 3120 

AACGTGACCT CCACCAGCGG TCACTCTTCA GGGAGCTCAT CTGGAGCCAT CACCTACCGG 3180 

CAGCA6CGGC C6GGCCCCCA CTTCCAGCAQ CAGCAGCCAC TCAATCTCAQ CCAGGCTCAQ 3240 

CAGCACATCA CCAC66ACCG CACTGGGAGC CACCGAAGGC AGCAQSCCTA CATCACTCCC 3300 

ACCATGGCCC AGGCTCCGTA CTCCTTCCCG CACAACAGCC CCA6CCACG6 CACTGTGCAC 3360 

CCGCATCTGG CTGCAGCCGC TGCCGCTGCC CACCTCCCCA CCCAGCCCCA CCTCTACACC 3420 

TACACTGCGC 066CG6CCCT GGGCTCCACC GGCACCGTGG CCCACCTGGT GGCCTCGCAA 3480

GGCTCTGCGC GCCACAOCGT GCAGCACACT GCCTACCCAG CCAGCATCGT CCACCAGGTC 3540 

CCCGTGAGCA T66GCO0C06 GGTCCTGCCC TCGCCCACCA TCCACCCGAG TCAGTATCCA 3600 

GCCCAATTTG CCCACCAGAC CTACATCAGC GCCTCGCCAO CCTCCACOGT CTACACTGGA 3660 

TACCCACTGA GCCCCGCCAA GGTCAACCAQ TACCCTTACA TATAAACACT GGAG6GGA6G 3720 

GAGGGAGGGA GG6AGGGAGA GAATGGCCCO AOGGAGGAGO GAGAGAAGGA 066A8GCGCT 3780 

OCTGGGAOCG T66GC6CT6G CCTTTTATAC TQAAQATGCC GCACACAAAC AAT6CAAACQ 3840 

0GGCA6G6GC GG6GG60G00 GGG6CA6AGG GCAGGGGGAC GGGTCOGOAC ACCAGTGAAA 3900 

CTTGAACCGO GAAGTGGGAG GAC GTACAGC AGAGAAGAGA ACATTTTTAA AAGGAAGGGA 3960 
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA TTTTAAAAAA 
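
The rows above follow a fixed layout: groups of ten characters, with a running position counter at the end of each full row. A minimal sketch of how such rows could be joined into one continuous string is given below. The helper name `join_sequence_rows` is hypothetical, and the sketch assumes the counter is the last whitespace-separated token and consists only of digits, so a counter damaged in scanning would not be recognized and would need manual review.

```python
def join_sequence_rows(rows):
    """Concatenate listing rows, dropping blank rows and trailing counters.

    Assumption: a trailing all-digit token is a position counter,
    not part of the sequence.
    """
    parts = []
    for row in rows:
        tokens = row.split()
        if not tokens:
            continue  # skip blank separator rows
        if tokens[-1].isdigit():
            tokens = tokens[:-1]  # drop the running position counter
        parts.append("".join(tokens))
    return "".join(parts)


if __name__ == "__main__":
    toy_rows = ["ACGTACGTAC GGGTTTAAAC 20", "", "TTAA"]
    print(join_sequence_rows(toy_rows))
```

The toy rows in the usage example are illustrative only and are not taken from the listing.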



SEQ ID NO:282 PCI2 Protein sequence:
Protein Accession #: NP_073577

MAFVYEGMAS HVQVFSPHTL QSS AFCSVKK LKVEPSSNWD MTGYG5HSKV YSQSKNIPPS 60 
QPASTTVSTS LPYPNPSLPY EQTIVFPGST GHIWTSASS TSVTGQVLGG PHNLMRRSTV 120 
SLLDTYQKCG LKRKSEHEN TS5 VQQEEH FPMIQNNASG ATVATATTST ATSKNSGSNS 180 
EGDYQLVQHE VLCSMTNTYE VLEFLGRGTF GQWKCWKRG TNEIVAHOL KNRPSYARQG 240 
QEVSILARL STES ADDYNF VRAYECFQHK NHTCLVFEML EQNLYDFLKQ NKFSPLPLKY 300 
RPVLQQVAT ALMKLKSLGL IHADLKPENI MLVDPSRQPY RVKVIDFGS A SHVSKAVCST 360 
YLQSRYYRAP EULGLPFCE AIDMWSLGCV IAELFLGWPL YPGASEYDQI RYISQTQGLP 420 
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKBARKYIF NCLDDMAQVN 480 
MTIDLEGSDM LVEKADRREF IDLLKKMLTI DADKRITPE TLNHPFVTMT HLLDFPHSTH 540 
VKSCPQNMO CKRRVNMYDT VNQSKTPFIT HVAPSTSTNL TMTFNNQLTT VHNQ APSSTS 600 
ATISLANPEV SILNYPSTLY QPS AASMAAV AQRSMPLQTG TAQICARPDP FQQALIVCPP 660 
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG LLAQQAWPSG TQQILLPPAW 720 
QQLTGVA1HT SVQHATVIPB TMAGTQQLAD WRNTHAHGSH YNP1MQQPAL LTGHVTLPAA 780 
QPLNVGVAHV MRQQPTSTTS SRKSKQHQSS VRNVSTCEVS SSQAISSPQR SKR VKENT PP 840 
RCAMVHSSPA CSTS VTCGWG DVASSTTRER QRQTIVIPDT PSPTVSVITI SSDTDEEEEQ 900 
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960 
TGNPRTHVP PUCTQASEVL VECDSLVPVN TSHHSSSYKS KSSSNVTSTS GHSSGSSSGA 1020 
nYRQQRPGP HFQQQQPLNL SQAQQHITTD RTGSHRRQQA YTTFIMAQAP YSFPHNSPSH 1080 
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGSARHT VQHTAYPASI 1140 
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYB ASPAST VYTGYPLSPA KVNQYPYI 

SEQ ID NO:283 PBY1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017700

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

| | | | | |

AGTCACAGCC AGGTAACCCT GGAGTGAAGC GGTTTAGTTA GAAGGGAGCA GATAAACTCG 60 

TCACTCTAGT AGCTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120 

GCCTCCAAAG CTTGT CTTT G CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180 

TGGTCAOCAA GAAAAAGAAT CTGGOCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240 

GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 

GGAGCCAATC TOACA GGAC C GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 

GTGAGTGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 

TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGATACAGAC AGAAAAGGAT TATCTCAATG 480 

ATCTAGAGCT GTGTGTTAGG GAAGTGGTTC AGCCCCTGAG AAATAAAAAG ACTGATAGGC 540 

TGGATOTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 

TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAAOCGGC CATGCAAGTA ATTGGAGAAS 660 

TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 

ATGAAGCACA TAGTATACTG GAGTCCTATG AAAAGGAAGA AGAGCTGAAG GAACATTTGA 780 

GCCACTGTAT CCAGTCCTTA AAGTAAQGCC 1TTTCAAATG ATGATTCCCA TCTCCTCTCA 840 

GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 

TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCAT TDCCTT TATAAAAAGC TGACATGCCA 960 

GAAGCCCTGA TTGACTTTTT PTCCCCCTGC GAGAATQACT AAAAATAACA TGGAAGAAGA 1020 

TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 

CTCTTCTTAA AGCTATTGTA ACTTGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 

ACACGTGGOC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 

ACTCTGCATG GCCGTCTTCT TTCCCCAAAC TCACTQTGGG GAGATGGGTG AAGACAAGTC 1260 

AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320 

CATTGGTGGQ GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAGTAT CTATTCACTG 1380 

TGAAAATCTG AAAAATATAA CAAAGTATOT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 

SEQ ID NO:284 PBY1 Protein sequence:
Protein Accession #: NP_060170

1 11 21 31 41 51 
| | | | | |

KEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTWES SVSGDHSGTL RRSQSDRTSY 60 

NQKLQEKMTP QGECSVAETL TPEBEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120 

VQPLRNKKTD RLDVDSLFSN IESVHQISAX LLSLLEEATT DVEPAKQVIG EVFLQIKGPL 180 
EDIYKiyCYH HDEAHSBUES YEKEEELKEH LSHCIQSLK 



SEQ ID NO:285 PBQ9 DNA SEQUENCE

Nucleic Acid Accession #: X66534

Coding sequence: 523-2076 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

| | | | | |

CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAQAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 

ATGTGCGGAT TTGCGAGQCG CGCCCTGGAG CTCCTAGAGA TCCGQAAGCA CAGCCCCGAG 300 

GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 

TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA C CATGT TCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 

AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 

TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAOTTT 780 

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCA6G 840 

AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAQ 960 

ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTO 1260 

AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCOCCC 1320 

AG CAAA CCCC AGTCCTCGCT GGTGATTCOC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 

ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

AAAATCAACC AGACCTTTAG OGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 

GTGAGGAGAT GGOACAACTC TGTGAAGAAA TCTTCAAGGG TCATGGACCT CAAAGGCCAA 1620 

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680 

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860 

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCAOCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 

GCGCTGATGG COCTGAAGAT GATGGAGCTC TCTGATGAAQ TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTG CACTCT GGATCAGTTT TTGCTGGCGT CGTTGG AGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAOATTACT CAAAGACTGT 2400 

CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460 

ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GG CTAAC AAG CAGTATTAAA ATTTCAGGAG 2700 

CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 

AACCAGCACT TACTACCTGT ACTCAAAATT C^GCACCTTG TACATATATC AGATAATTGT 2880 

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940 

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 

SEQ ID NO:286 PBQ9 Protein sequence:
Protein Accession #: Q02108

1 11 21 31 41 51 

| | | | | |

MFCTKLKDLK ITGECPFSLL APGQVPNESS EBAAGSSESC KATVPICQDI PEKNIQKSLP 60 

QRKTSRSRVY LETLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120 

QAVAAGVPVE VTKESLGEEV FKICYEEDEN ILGWGGTLK DPLNSPSTLL KQSSHCQEAG 180 

KRGRLEDASI LCLDKEDOPL HVYYFFPKKT TSLILPGIIK AAAHVLYETE VEVSLMPPCF 240 

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LF CKTFPFHP MFDKDMTILQ 300 

FGNGIRRLMN RRDFQGKPNP EBYFECLTPK INQTFSGIMT MLNHQPWKV RRWDNSVKKS 360 

SRVHDLKGQM IYIVBSSAZL FLGSPCVDRL EDFTGRGLYL SDIPIHNALR DWLIGEQAR 420 

AGDGLKKRLG KLKATLEOAH OALEEEKKKT VDLLCSIPPC EVAQQLWQGQ WQAKKFSNV 480 

THLFSDIVGP TAICSQCSPL QVITHLNALY TRFDQQCGEL DVYKVETIGD AYCVAGGLHK 540 
ESDTHAVOIA LMALKHMELS DEVMSPHGEP XKHRIGLHSG SVFA6WGVK MPRYCLFGNN 600
VTLANKFESC SVPRKINVSP TTYRLLKDCP GFVFTPRSRE ELPPNFPSEI PGICHFLDAY 660
QQGTNSRPCF QKKDVEDGNA NFLGKASGXD 



SEQ ID NO:287 PFD2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000720

Coding sequence: 119-6664 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

| | | | | |

AGAATAAGGG CAGGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG CCCAGCACAG TGCCCTGCAC ACAGTAGTCG CTCAATAAAT OTTCQTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGG CAGCAGCAAG CG6ACCACGC 180 

GAACGAGGCA AACTATGCAA GAG6CACCA6 ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240 

TCAGCCGAAT AGCTGCAAGC AAACTGTCCT GTCTTGGCAA GCTGCAftTOQ ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAG GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAGAG CAAAAAACAG GGTAACTCGT CCAACAGCCO 420 

ACCTGCCCGC GCCCTTTTCT GTTTATCACT CAATAACCCC ATCCGAAGAG CCTGCATTAG 480 

TATAGTGGAA TGGAAACCAT TTGACATATT TATATTATTQ GCTATTTTTG OCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CATTCCCTGA AGATGATTCT AATTCAACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAGATTAT 660 

AG06TATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGO AATGGATGQA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATTGTTTAQ TGTAATTTTG GAACAATTAA OCAAAGAAAC 780 

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTOCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAGTTGTCCT 900 

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGG TATTATTTGT 960 

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAAAACATG 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080 

GAATGGACGC CAGTGTACTG CCAATGGCAC GGAATGTAGG AGTGGCTGGG TTGGCCOGAA 1140 

OGGAGGCATC ACCAACTTTG ATARCTTTGC CTTTGCCATG CTTACTGTGT OTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACOTGCTCTA CTGGGTAAAT GATGCGATAG GATGGGAATG 1260 

GCCATGGGTG TATTTTGTTA GTCTGATCAT OCTTGGCTCA TTTTTCGTOC TTAACCTGGT 1320 

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGGAGA 1380 

TTTCCAGAAG CTGCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG AGGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGCGCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680 

CAGCCGACGC TG GC GTC G CT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGGTTATOGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980 

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040 

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGGACT TCCCTGAGCA ACTTAGTGGC 2100 

ATCCTTATTA AACTCCATGA AGTCCATCGC TTCGCTGT T Q CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220 

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TGTTCGAGAT 2280 

C CTGACAGGC GAAGACTGGA ATGCTGTGAT CTACGATGGC ATCATGGCTT ACGGGGGCGC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTGAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGCGGAAGA AAAGGAGAGG AAAAAGATTG OCAGAAAAGA 2520 

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAOTGACAA 2580 

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

OGATGTGCCA GTAGGGGAAG AGGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700 

GGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG CCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880 

TGOCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTGGGTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TGTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTGGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120 

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC OCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180 

TAAGCACGTG GTCCAGTGCG TC T T CC TGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGOCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360 

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAACGTCC TCTCTGCTAT GATGGCGCTC TTCACAGTCT CCACGTTTGA 3480 

GGGCTGGCCT GC GT TQCTGT ATAAAGCCAT CGACTCGAAT GGAGAGAACA TCGGCCCAAT 3540 

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGG CTTT G T CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660 

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAGTACAAGT TCTGGTACGT 3780 

GGTGAACTCT TCGCCTTTCG AATACATGAT G TriiyiXXTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG OCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020 

TATAGACGTG GCCCTCAGCG AAGCGGACCC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080 
TQCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCOTCTTTT 4140

CCGAGTGATG CGATTGGTGA AGCTTCTCAQ CAGGGGGGAA GGCATCCGGA CATTGCTGTG 4200 

GACTTTTATT AAGTCCTTTC AGGCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4260 

CTTCATCTAT GCGGTCATTG GCATGCAGAT GTTTGGGAAA GTTGCCATQA GAGATAACAA 4320 

CCAGATCAAT AGGAACAATA ACTTCCAGAC OTTTCCCCAO GCGGTGCTGC TGCTCTTCAG 4380 

GTGTCCAACA OGTTGAGGCCT GGCAGGAGAT CAT6CTG60C TGTCTCCCAG GGAAGCTCTG 4440 

TGACCCTGA6 TCAGATTACA ACCCCGGGGA GGAGTATACA TGTGGGAGCA ACTTTGCCAT 4500 

TGTCTATTTC ATCAGTTTTT ACATGCTCTG TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560 

TGTCATCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTTGQ OGCCTCACCA 4620 

TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAGG GAAGGATAAA 4680 

ACACCTTGAT GTGGTCACTC TGCTTCGACG CATCCAGCCT CCCCTGGGGT TTGGGAAGTT 4740 

ATGTCCACAC AGGGTAGCGT GCAAGAGATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800 

CGGGACAGTC ATGTTTAATQ CAACCCTGTT TCCTTTGGTT CGAACOGCTC TTAAGATCAA 4860 

GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGG GCTGTGATAA AGAAAATTTG 4920 

GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTG ATGATGAGGT 4980 

AACCGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTAGGA AATTCAAGAA 5040 

ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTGCCCTACA 5100 

GGCGGGATTA AGGACACTGC ATGACATTGG GCCAGAAATC CG GCGT GCTA TATCGTGTGA 5160 

TTTGCAAGAT GACGAGCCT6 AGGAAACAAA ACGAGAAGAA GAAGATGATG TGTTCAAAAG 5220 

AAATGGTGCC CTCCTTGGAA ACCATGTCAA TCATGTTAAT AGTGATAGGA GAGATTCOCT 5280

TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGCCTTCAA TTCCACCTGC 5340 

AAGTGATACT GAGAAACCGC TGTTTCCTCC AGCAGGAAAT TCGGTGTGTC ATAACCATCA 5400 

TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460 

CAATATGTCC AAAGCTGCCC ATGGAAAGCG GCCCAGCATT GGGAACCTTG AGCATGTGTC 5520 

TGAAAATGGG CATCATTCTT COCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580 

GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTCCGAC TCAGGAGATG AACAGCTCCC 5640 

AACTATTTGC CGGGAAGACC CAGAGATACA TGGCTATTTC AGGGAOCCCC ACTGCTTGGG 5700 

GGAGCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCTCGC CCACCTGGAQ 5760 

CAGGCAAAAC TATGGCTACT ACAGCAGATA CCCflGGCAGA AACATCGACT CTGAGAGGCC 5820 

CCGAGGCTAC CATCATCCCC AAGGATTCTT GGAGGACGAT GACTCGCCCG TTTGCTATGA 5880 

TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACCGGAGATC 5940 

CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000 

CATCTTCCCC CATOGCACGG OCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060 

TGCCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120 

GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA CCGTGCTACA CCCCOCTGAT 6180 

CCAAGTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTG CCGT CCCTGCACCG 6240 

CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300 

GACTGTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG CGGACAGCTT 6360 

GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420 

GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480 

TGCAGCCAGC ACCCT G CT TA ATGGGAACGT GCGTCCCCGA GCCAACGGGG ATGTGGGCCC 6540 

CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT CCTGGCTACA GCGACGAAGA 6600 

GCCAGACCCT GGGAGGGATG AGGAGGACCT GGOGGATGAA ATGATATGCA TCACCACCTT 6660 

GTAGCCCOCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGCGCAGG AGAGCCAGGG 6720 

GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780 

AGACTTTOGT ATAAGAGATO TCATGCCTCA AGAAAGCCAT AAACCTGGTA GGAACAGGTC 6840 

CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900 

CCCCGCCCTC TCACAGAGGA TGGGTGAGGA GGCCAGACCT GCCCTGCCCC ATTGTCCAGA 6960 

TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020 

AGGCATGGCG GCGGGGTGCA GGGGAAAGTT AAAGGTGATG ACGATCATCA CACCTCGTGT 7080 

CGTTACCTCA GCCATCGGTC TAGCATATCA GTCACTGGGC CCAACATATC CATTTTTAAA 7140 
CCCTTTOOOC CAAATACACT GCGTCCTGGT TCCTGTTTAG CTGTTCTGAA ATA 

SEQ ID NO:288 PFD2 Protein sequence:
Protein Accession #: A38198 

1 11 21 31 41 51 

| | | | | |

HMMMHHHKKM QHQRQQQADH ANBANYARGT RLPLSGBGPT SQPNSSKQTV LSWQAAIEAA 60 

RQAKAAQTMS TSAPPPVGSL SQRKRQQYAK SKKQGNSSNS RFARALFCLS LNNPIRRACI 120 

SIVEWKPFDI FILLAIFANC VALAIYIPPP EDDSNSTNHN LKKVEYAFLI IETVBTPLKI 180 

IAYGLLLHFN AYVRNGWNLL DFVTVIVGLF SVTLEQLTKE TEGGNHSSGK SGGFDVKALR 240 

AFRVLRPLRL VSGVPSLQW LNSIIKAMVP UaiALLVLF VIIIYAIIGL ELFIGXMHKT 300 

CFFADSDXVA EEDPAPCAPS GNGRQCTANG TECRSGWVGP NGGTTNFDNF AFAMLTVFQC 360 

ITMBGWTDVL YWVNDAIGWE WPWVYPVSLI ILGSFFVUIL VLGVLSGBPS KEREKAKARG 420 

DFQKLREKQO LEEDLRGYLD WITQAEDIDP ENEEEGGEEG KRNTSMPTSE TESVNTBNVS 480

GEGENRGCCG SLWCWWRRRG AAKAGPSGCR RHGQAISKSK LSRRWRRHNR FNRRRCRAAV 540 

KSVTPYWLV1 VLVPLSTLTI SSEHYNQPDW LTQIQDIANK VLLALPTCEM LVK2CTSLGLQ 600 

AYFVSLFHRF DCFWCGGIT ETILVELBIM SPLGISVFRC VRLLRIFKVT RHWTSLSNLV 660 

ASLLNSMKSI ASLLLLLFLF IIIPSLLGMQ LFGGKFNFDE TQTKRSTFDN FPQALLTVPQ 720 

ILTGEDWNAV MYDGTMAYGG PSSSGMIVCI YFIILFICGN YILLNVFLAI AVDNLADAES 780 

LNTAQKEEAB EKBRKKIARK ESLENKKKNK PEVNQIAKSD NKVTXDDYRE EDEDKDPYPP 840 

CDVPVGEEEE EBEBDEPEVP AGPRFRRISE LNMKBKLAPI PEGSAFFILS KTOPIRVGCH 900 

KLINHHIFd LILVFtMLSS AALAAEDPIR SHSFRNTILG YFDYAFTAIF TVEILLKMTT 960 

FGAFLHKGAF CFNYFNLUDM LWGVSLVSF GXQSSAISW KILHVLRVLR PLRAIH RAKQ 1020 

LRBWQCVFV AIRTXGNIKX VTTLLQFMFA CIGVQLFKGK FYRCTDBAKS NPEECRGLFI 1080 

LYKDGDVDSP WRERIWQNS DFNFDNVLSA MMALFTVSTF EGWFALLYKA XDSNGERXGP 1140 

IYHHRVEISI PFTIY1IIVA FFHMNIFVGF VTVTFQEQGE KBYKKCELDK NQRQCVEYAL 1200 

KARPLRRYIP KNPYQYKFW* WNSSPFEVM MFVLIHLNTL CLAMGHYEQS KMFKDAMDIL 1260 

NMVPTGVFTV EKVLKVIAFK PKGYFSDAWN TPDSLIVIGS IIDVALSKAD PTKSBNVPVP 1320 
TATPGNSEES NRISITFFRI* FRVMRLVKLL SRGEQIRTLL WTFIKSPQAL FYVALLIAML 1380

FPIYAVIGMQ MFGKVAMRDN NQINRNNNFQ TFPQAVLLLP RCATGEAWQE IMLACLPGKL 1440 

CDPESDYNFG EEYTCGSNFA IVYPISPYML CAFLIINLFV AVIHDNFDYL TRDWSILGPH 1500 

HLDEPKRIWS EYDPEAKGRI KHLDWTLLR RIQPPU2FGK LCFHRVACKR LVAKNMPLNS 1560 

DGTVHFNATL FALVRTALKI KTEGNLEQAN EELRAV1KKI WKKTSHKLLD QWPPAGDDB 1620 

VTVGXFYATF LIQDYFRKFK KRKEQGLVGK YPAKNTTIAL QAGLRTLHDI GPEIRRAISC 1680 

DLQDDEPEET KREEEDDVFK RNGALLGNHV NHVNSDRRDS LQQTNTTHRP LHVQRPSIPP 1740 

ASDTEKPLFP FAGNSVCENH HNHNSIGKQV PTSTNANLNN ANHSKAAHGK RPSIGNLEHV 1800 

SENGHHS5HK HDREPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPBI HGYFRDPHCL 1860 

GEQEYFSSEE CYEDDSSPTW SRQNYGYYSR YPGRNIDSER PRGYHHPQGP LEDDDSPVCY 1920 

DSRRSPRRRL LPPTPASHRR SSFNPECLRR QSSQEEVPSS PIPPHRTALP LHLMQQQIMA 1980 

VAGLDSSKAQ KYSPSHSTRS KATPPATPPY RDWTPCYTPL IQVEQSEALD QVNGSLPSLH 2040 

RSSWYTDEPD ISYRTFTPAS LTVPSSFRNK NSDKQRSADS LVEAVLISEG LGRYARDPKP 2100 

VSATKHEIAD ACDLTZDEHE SAASTLLNGN VRFRANGDVC PLSHRQDYEL QDFGPGYSDE 2160 
EPDPGRDEED LADEMICITT L 

SEQ ID NO:289 OBI6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002812

Coding sequence: 150-3382 (underlined sequence corresponds to start and stop codon) 



1 11 21 31 41 51 

| | | | | |

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GOGTCCGCCT CCTGTGCCC6 CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGOCGO GA TOG GAGCTGC GCGOGGATCC CCGGCCAGAC 180

CCCGCCGGTT GCOVCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG OCQTCCTCCC AGGATGCACT 6CAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCOCGG TACATGTQTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGQ 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAQT GTGTGGCTCG GGATGATQTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACOCTCGG CCCACCTACC AATGGTTCOG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAQ CAACCACACA GTCAGCAGCA AGGA6CGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCT6 TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCO AOGTATGAGG AGGCCATOTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCAC6 CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CAOCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA OGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGC6GC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT 0T8GOCACTO 1380 

TGCGCTCCT6 GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGA QGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCGGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACOCCAAGC CGCTGATTCA GTGGAAAGGC AAGGAOCGCA 2040 

TCCTGGACCC CAOCAAGCTG GGACOCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTOTCG C CTACATCATT GCCG T GCTSQ GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCOOCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGQAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGT A CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAQCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGOCGGGAG GCTGAGCCCC 2760 

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CC GCTTT Q TO CATAAGGACT 2940 

TGGCTGCGOG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGACTACT ACCACTTCCG CGAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC COCOGAGGCC ATCCTGGAGG OTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

C C T TCQQTOT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCPGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TG GGAQA CAG CACCQTGQAC AGCA AGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 
CAGCATGATG GGCAAGATCC CTQTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480

TTGCTG AGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTO GACOCAAACT GGGCGACTAO GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCTTCCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTC TCOC CTT OACCGGGTOC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGSCACAC 3780 

AGCGTTAATQ AGTCTCTTGC CCACTGGTCC ACTTG666GT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGT6CACA CTGACCCAGA CCCACGTCTT 3900 

ccccA ocerr ctc t cc tt tc ctcatcctaa gtgcctggca gatgaaggag ttttcaggag 3960 

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGCGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATCCTTA CCCCACACTT TTATTCTTGT CGTTTTTTGT TTGTTTTGTT TTTTTGTTTT 4140 
TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 



SEQ ID NO:290 OBI6 Protein sequence:
Protein Accession #: NP_002812

1 11 21 31 41 51 

| | | | | |

MGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAPGP 60 

VHVYWLLDGA FVQDTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDDVTGE EARSAHASFN 120 

IKWIEAGPW LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180 

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNPTLSIA DSSFARWLA PQDWVARYE 240 

EAMFHCQPSA QPPPSLQWLP EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300 

CIGQGQRGFP IILEATLHLA EIEDMPLPEP RVFTAGSEER VTCLPPKGLP EPSVWWKHAG 360 

VRLPTHGRVY QKGHELVLAH IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420 

SQLEBGKPGY LDCLTQATPK FTWWYRNQM LISEDSRFBV FKNGTLRINS VEVYDGTWYR 480 

CHSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEPDKEA TVPCSATGRE KPTIKWERAD 540 

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC IASNGPCGQI RAHVQLTVAV FITFKVEPER 600 

TTVYQGHTAL LQCEAQGDPK PLIQWKGKDR ILDPTKLGPR MHIPQNGSLV IHDVAPBDSG 660 

RYTCIAGNSC NIKHTEAPLY WDK PVPEBS EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720 

GLMPYCKKRC KAKRLQKQPE GEBPEHBCLN GGPLQNGQPS AEIQEBVALT SLGSGPAA3M 780 

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV PXARAQGLEE GVAETLVLVK SLQTKDEQQQ 840 

LDPRRBLEMP GKLNHANWR LLGLCREAEP HVKVLBYVDL GDLRQFLRIS KSB33EKLKSQ 900 

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QROVKVSALG LSKDVYNSFY 960 

YHFRQAWVPL RWMSPEAILB GDFSTKSDVW AFGVLMWEVP THGEMPHGGQ ADDEVLADLQ 1020 
AGKARLPQPE GCPSKLYRLM QRCWALSPKD RPSPSEIASA LGDSTVDSKP 



SEQ ID NO:291 AAB1 DNA SEQUENCE

Nucleic Acid Accession #: NM_002205

Coding sequence: 1-3160 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCGCGG 60 

CGCCGACCCC CGCTSSTGCC GCTGCTGTTG CTGCTSSTGC CGCOGOCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCOCAOCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGAOGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300 

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420 

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCA3GCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTCAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGA3TTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC T GT CTGC C AC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCGA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAO 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTCGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGAGC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT COGCTAGTQC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGAC7CCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

OGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTOGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG AOCTGCAGCT GGAAGTGTTT 1980 
GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATOCC 2040

CAGAATGTGG GT6AGG6T6G CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAO 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAQTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAOT AAGCGACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2860 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAQ GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGOCACCTC TGATGCCTGA 



Protein Accession #: NP_002186

1 11 21 31 41 51 

I I I I I I 

MGSRTFESPL HAVQLRWGPR RRPPLLPLLL LLLPPPPHVG GFNLDAEAPA VLSGPPGSFF 60 

GFSVEFYRFG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP XBFDSKGSRL 120 

LBSSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKBPLS DPVGTCYLST 180 

DNFTRILEYA PCRSDFSKAA GQGYCQGGFS AEFTKTGRW LGGPGSYFWQ GQILSATQEQ 240 

IAESYYPEYL XNLVQGQLQT RQASSIYDDS YLGYSVAVGB PSGDDTBDPV AGVPKGNLTY 300 

GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 

EVGKVYVYLQ HPAGIEPTPT LTLTGHDBFO RFGSSLTPLG OLDQDGYNDV AIGAPFGGET 420 

QQGWFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSPG 480 

VDKAWYRGR PIVSASASLT IPPAMFNPEB RSCSLBGNPV ACINLSPCLH ASGREVADSZ 540 

GPTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNBSEFRDK 600 

LSPIEXALN? SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLBVP 660 

GEQNHVYLGD KHALNLTFHA QNVGEGGAYE AELRVTAPPB AEYSGLVRHP GNFSSLSCDY 720 

PAVNQSRLLV CDLGNPHKAG ASLWGGLRFT VPHLRDTKKT IQPDFQILSK KLNNSQSDW 780 

SFRItSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 

SQGVLELSCP QALEGQQLLY VTRVTCLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900 

SASSGPQILK CPEAECFRLR CELGPLHQQB SQSLQLHFKV HAKTFLQREH QPPSLQCEAV 960 

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILPGL LLLGLLIYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 

SEQ ID NO:393 LBH4 DNA SEQUENCE

Nucleic Acid Accession #: BC001291

Coding sequence: 44-541 (start and stop codons are underlined)



1 11 21 31 41 51 
I I I I I I 

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC AOCGCTGGGG AOGATGGCGC TOCTCGCCIT 60 
GCTGCTGGTC GTGGCOCTAC CGCGGOTGTO GACAGAOGCC AAOCTG ACTG CG AGACAACG 120 
AGATCCAGAG GACTCCCAGC GAACGGAOG A GGGTGACAAT AGAGTGTGGT GTCATGTTTG 180 
TGAGAG AGAA AACACTTTCG AGTGCCAG AA CCCAAGGAGG TGCAAATGGA CAGAGCCATA 240 
CTGCGTTATA GCGGCCGTGA AAATATTTCC AC O ' l ' l 11T1C ATGGTTGOGA AGCAGTGCTC 300 
CGCTGGTTGT GCAGCGATGG AGAGACCCAA GOCAGAGGAG AAGCGGTTTC TCCTGGAAGA 360 
GCCCATGCCC TTCTTTTAOC TCAAGTGTTO TAAAATTOGC TACTGCAATT TAGAGGGGCC 420 
AOCTATCAAC TCATCAGTGT TCAAAG AAT A TGCTGGGAGC ATGGGTG AG A GCTGTCGTGG 480 
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCXXJGCCTCA GOCTGTCTIQ 540 
AGCCACGGGA CTGCCACAGA CTG AGOCT TC CGGAGCAT GQ ACTCGCTCCA GAOOGTTGTC 600 
ACCTGTTGCA TTAAACTTGT TTTCTGTTG A TTACCTCTTG GTTTGACTTC CCAGGGTCTT 660 
GGGATGGG AG AGTGGGGATC AGG1GCAGTT GGCTCTTAAC CCTCAAOGGT TCTTTAACTC 720 
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTCTTTG 780 
AAATCAAACC TTGT AACTCA TTT ATTGCTG ATGGOCACTC TTTTOCTTGA CTCCCCTCTG 840 
CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG 900 
TGCTGAG ATG CTTCOG ACCT TTCAGGTGAC GCAGGAACAC TGGGGG AGTC TGAATG ATTG 960 
GGGTOAAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGO GGGGCAGTGG GGCACACGTT 1020 
AOOOCTOOOC CCATTOCAGT GGTGGAGGOG CTGTGGATGG C1GC1 1 1 IC C TCAAOCTTTC 1080 
CTACCAGATT CCAGG AGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC 1140 
ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200 
ACTTAGGCCA AGTAG AGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC 1260 
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTTCACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 
SEQ ID NO:394 LPH4 Protein sequence
Protein Accession #: AAH01291



1 11 21 31 41 51 
I I I I I I

MALLALLLW ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNTPRRC 60 
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FVLKCCIOR Y 120 
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS 
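The coding-sequence coordinates given for this entry (44-541) can be checked against the length of the listed protein. The sketch below is illustrative only and is not part of the original disclosure. It assumes the coordinates are 1-based and inclusive and that the range includes the stop codon, as the header suggests; the function name cds_protein_length is hypothetical.

```python
def cds_protein_length(start: int, end: int) -> int:
    """Number of amino acids encoded by a coding sequence.

    Assumes 1-based inclusive coordinates and that the range
    includes the terminal stop codon (an assumption, see lead-in).
    """
    length = end - start + 1          # nucleotides in the coding range
    if length <= 0 or length % 3 != 0:
        raise ValueError("coding range is not a positive multiple of 3")
    return length // 3 - 1            # drop the stop codon


# Coordinates taken from the header of the BC001291 entry above.
print(cds_protein_length(44, 541))
```

Under these assumptions the range spans 498 nucleotides, or 166 codons, which gives 165 residues once the stop codon is excluded. That agrees with the protein listing, whose position ruler marks 120 residues ahead of a final row of 45.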



It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference. 
WHAT IS CLAIMED IS:

1 1. A method of detecting a prostate cancer-associated transcript in a cell

2 from a patient, the method comprising contacting a biological sample from the patient with a 

3 polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence 

4 as shown in Tables 1-16. 

1 2. The method of claim 1, wherein the polynucleotide selectively

2 hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

1 3. The method of claim 1, wherein the biological sample is a tissue 

2 sample. 

1 4. The method of claim 1, wherein the biological sample comprises 

2 isolated nucleic acids. 

1 5. The method of claim 4, wherein the nucleic acids are mRNA. 

1 6. The method of claim 4, further comprising the step of amplifying 

2 nucleic acids before the step of contacting the biological sample with the polynucleotide. 

1 7. The method of claim 1, wherein the polynucleotide comprises a 

2 sequence as shown in Tables 1-16. 

1 8. The method of claim 1, wherein the polynucleotide is labeled.

1 9. The method of claim 8, wherein the label is a fluorescent label. 

1 10. The method of claim 1, wherein the polynucleotide is immobilized on 

2 a solid surface. 

1 11. The method of claim 1, wherein the patient is undergoing a therapeutic 

2 regimen to treat prostate cancer. 

1 12. The method of claim 1, wherein the patient is suspected of having 

2 prostate cancer. 
1 13. A method of monitoring the efficacy of a therapeutic treatment of

2 prostate cancer, the method comprising the steps of: 

3 (i) providing a biological sample from a patient undergoing the therapeutic 

4 treatment; and 

5 (ii) determining the level of a prostate cancer-associated transcript in the 

6 biological sample by contacting the biological sample with a polynucleotide that selectively 

7 hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16, 

8 thereby monitoring the efficacy of the therapy. 

1 14. The method of claim 13, further comprising the step of: (iii) comparing 

2 the level of the prostate cancer-associated transcript to a level of the prostate cancer-

3 associated transcript in a biological sample from the patient prior to, or earlier in, the

4 therapeutic treatment.

1 15. The method of claim 13, wherein the patient is a human. 

1 16. A method of monitoring the efficacy of a therapeutic treatment of 

2 prostate cancer, the method comprising the steps of: 

3 (i) providing a biological sample from a patient undergoing the therapeutic 

4 treatment; and 

5 (ii) determining the level of a prostate cancer-associated antibody in the 

6 biological sample by contacting the biological sample with a polypeptide encoded by a 

7 polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence 

8 as shown in Tables 1-16, wherein the polypeptide specifically binds to the prostate cancer- 

9 associated antibody, thereby monitoring the efficacy of the therapy. 

1 17. The method of claim 16, further comprising the step of: (iii) comparing 

2 the level of the prostate cancer-associated antibody to a level of the prostate cancer- 

3 associated antibody in a biological sample from the patient prior to, or earlier in, the 

4 therapeutic treatment.

1 18. The method of claim 16, wherein the patient is a human. 
1 19. A method of monitoring the efficacy of a therapeutic treatment of

2 prostate cancer, the method comprising the steps of: 

3 (i) providing a biological sample from a patient undergoing the therapeutic 

4 treatment; and 

5 (ii) determining the level of a prostate cancer-associated polypeptide in the 

6 biological sample by contacting the biological sample with an antibody, wherein the antibody 

7 specifically binds to a polypeptide encoded by a polynucleotide that selectively hybridizes to 

8 a sequence at least 80% identical to a sequence as shown in Tables 1-16, thereby monitoring 

9 the efficacy of the therapy. 

1 20. The method of claim 19, further comprising the step of: (iii) comparing 

2 the level of the prostate cancer-associated polypeptide to a level of the prostate cancer- 

3 associated polypeptide in a biological sample from the patient prior to, or earlier in, the 

4 therapeutic treatment. 

1 21. The method of claim 19, wherein the patient is a human.

1 22. An isolated nucleic acid molecule consisting of a polynucleotide 

2 sequence as shown in Tables 1-16. 

1 23. The nucleic acid molecule of claim 22, which is labeled. 

1 24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

1 25. An expression vector comprising the nucleic acid of claim 22. 

1 26. A host cell comprising the expression vector of claim 25. 

1 27. An isolated polypeptide which is encoded by a nucleic acid molecule 

2 having polynucleotide sequence as shown in Tables 1-16. 

1 28. An antibody that specifically binds a polypeptide of claim 27. 

1 29. The antibody of claim 28, further conjugated to an effector component.
1 30. The antibody of claim 29, wherein the effector component is a

2 fluorescent label. 

1 31. The antibody of claim 29, wherein the effector component is a 

2 radioisotope or a cytotoxic chemical. 

1 32. The antibody of claim 29, which is an antibody fragment. 

1 33. The antibody of claim 29, which is a humanized antibody.

1 34. A method of detecting a prostate cancer cell in a biological sample 

2 from a patient, the method comprising contacting the biological sample with an antibody of 

3 claim 28. 

1 35. The method of claim 34, wherein the antibody is further conjugated to 

2 an effector component. 

1 36. The method of claim 35, wherein the effector component is a 

2 fluorescent label. 

1 37. A method of detecting antibodies specific to prostate cancer in a 

2 patient, the method comprising contacting a biological sample from the patient with a 

3 polypeptide encoded by a nucleic acid that comprises a sequence from Tables 1-16.

1 38. A method for identifying a compound that modulates a prostate cancer- 

2 associated polypeptide, the method comprising the steps of: 

3 (i) contacting the compound with a prostate cancer-associated polypeptide, the 

4 polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 

5 80% identical to a sequence as shown in Tables 1-16; and 

6 (ii) determining the functional effect of the compound upon the polypeptide. 

1 39. The method of claim 38, wherein the functional effect is a physical 

2 effect.


1 40. The method of claim 38, wherein the functional effect is a chemical 

2 effect. 

1 41. The method of claim 38, wherein the polypeptide is expressed in a

2 eukaryotic host cell or cell membrane. 

1 42. The method of claim 38, wherein the functional effect is determined by 

2 measuring ligand binding to the polypeptide. 

1 43. The method of claim 38, wherein the polypeptide is recombinant. 

1 44. A method of inhibiting proliferation of a prostate cancer-associated 

2 cell to treat prostate cancer in a patient, the method comprising the step of administering to 

3 the subject a therapeutically effective amount of a compound identified using the method of 

4 claim 38. 

1 45. The method of claim 44, wherein the compound is an antibody. 

1 46. The method of claim 45, wherein the patient is a human. 

1 47. A drug screening assay comprising the steps of:

2 (i) administering a test compound to a mammal having prostate cancer or a 

3 cell isolated therefrom; 

4 (ii) comparing the level of gene expression of a polynucleotide that selectively 

5 hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-16 in a 

6 treated cell or mammal with the level of gene expression of the polynucleotide in a control 

7 cell or mammal, wherein a test compound that modulates the level of expression of the 

8 polynucleotide is a candidate for the treatment of prostate cancer. 

1 48. The assay of claim 47, wherein the control is a mammal with prostate 

2 cancer or a cell therefrom that has not been treated with the test compound. 

1 49. The assay of claim 47, wherein the control is a normal cell or mammal. 
1 50. A method for treating a mammal having prostate cancer comprising

2 administering a compound identified by the assay of claim 47. 

1 51. A pharmaceutical composition for treating a mammal having prostate

2 cancer, the composition comprising a compound identified by the assay of claim 47 and a 

3 physiologically acceptable excipient.

1 52. The method according to claim 1, wherein said biological sample is

2 contacted with a plurality of polynucleotides comprising a first polynucleotide that 

3 selectively hybridizes to a sequence at least 80% identical to a first sequence as shown in 

4 Tables 1-16; and a second polynucleotide that selectively hybridizes to a second sequence at 

5 least 80% identical to a second sequence as shown in Tables 1-16. 

1 53. A method according to claim 52, wherein the plurality of 

2 polynucleotides comprises a third polynucleotide that selectively hybridizes to a sequence at 

3 least 80% identical to a third sequence as shown in Tables 1-16.

1 54. A method of detecting a prostate cancer associated transcript, the 

2 method comprising contacting a biological sample from the patient with a plurality of 

3 polynucleotides wherein at least two of said polynucleotides selectively hybridize to a 

4 different sequence at least 80% identical to a sequence as shown in Tables 1-16.

1 55. A method of detecting a prostate cancer, the method comprising the 

2 steps of: 

3 (i) providing a biological sample from a patient; 

4 (ii) contacting the biological sample with a first polynucleotide that selectively 



5 hybridizes to a sequence at least 80% identical to a first sequence as shown in Tables 1-16 to 

6 determine the level of a prostate cancer-associated transcript in the biological sample; and 

7 with a second polynucleotide that selectively hybridizes to a second sequence at least 80% 

8 identical to a sequence not shown in Tables 1-16; wherein the expression of said second 

9 sequence is not substantially changed in prostate cancer, to determine the level of expression
10 of a control transcript in the biological sample;
11 (iii) comparing the level of the prostate cancer-associated transcript to a level

12 of the normal tissue associated transcript in the biological sample. 

1 56. A method of quantitating a prostate cancer-associated transcript in a 

2 cell from a patient, the method comprising contacting a biological sample from the patient 

3 with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a 

4 sequence as shown in Tables 1-16. 

1 57. The method of claim 56, wherein the polynucleotide selectively 

2 hybridizes to a sequence at least 95% identical to a sequence as shown in Tables 1-16. 

1 58. The method of claim 56, wherein the biological sample is a tissue 

2 sample. 

1 59. The method of claim 56, wherein the biological sample comprises 

2 isolated nucleic acids. 

1 60. The method of claim 56, wherein the nucleic acids are mRNA. 

1 61. The method of claim 59, further comprising the step of amplifying 

2 nucleic acids before the step of contacting the biological sample with the polynucleotide. 

1 62. The method of claim 56, wherein the polynucleotide comprises a 

2 sequence as shown in Tables 1-16. 

1 63. The method of claim 56, wherein the polynucleotide is labeled. 

1 64. The method of claim 63, wherein the label is a fluorescent label. 

1 65. The method of claim 56, wherein the polynucleotide is immobilized on 

2 a solid surface. 

1 66. The method of claim 56, wherein the patient is undergoing a

2 therapeutic regimen to treat metastatic prostate cancer. 

1 67. The method of claim 56, wherein the patient is suspected of having 

2 metastatic prostate cancer. 

1 68. A biochip comprising a plurality of polynucleotides that selectively 

2 hybridize to a sequence at least 80% identical to a sequence as shown in Tables 1-16. 

1 69. A method of screening drug candidates comprising: 

2 i) providing a cell that expresses an expression profile gene selected from the 

3 group consisting of an expression profile gene set forth in Tables 1-16 or fragment thereof; 

4 ii) adding a drug candidate to said cell; and 

5 iii) determining the effect of said drug candidate on the expression of said 

6 expression profile gene. 

1 70. A method according to claim 69 wherein said determining comprises

2 comparing the level of expression in the absence of said drug candidate to the level of 

3 expression in the presence of said drug candidate. 
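
The "at least 80% identical" and "at least 95% identical" thresholds recited throughout the claims can be illustrated with a simple calculation. The sketch below is illustrative only and forms no part of the claims. It assumes two sequences that have already been aligned to equal length, and it counts identical positions over the alignment length; it does not reproduce any definition of percent identity or any alignment algorithm given elsewhere in the specification. The function name percent_identity and the second example sequence are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two sequences agree.

    Assumes the inputs are pre-aligned and of equal length; gap
    characters ('-') never count as matches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        x == y and x != "-" for x, y in zip(a.upper(), b.upper())
    )
    return 100.0 * matches / len(a)


# First ten bases of the AAB1 listing versus a made-up variant that
# differs at two positions.
print(percent_identity("ATGGGGAGCC", "ATGGGCAGCT"))  # 80.0
```

In practice, identity over full-length sequences is normally computed after an alignment step, and the choice of alignment method and parameters affects the result.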
